Items from 21 to 40 out of 937 results

chapter

Novel alignment method for DNN TTS training using HMM synthesis models

Sinisa Suzic, Tijana Delic, Darko Pekar, Vladimir Ostojic

2017 IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY) > 271 - 276

2017 IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY)

To train neural networks (NN) for text-to-speech synthesis (TTS), phonetic segmentation must be performed. The most accurate segmentation is performed manually, but the process of creating manual alignments is costly and time-consuming, so automatic procedures are preferable. In this paper, a simple alignment method based on models trained during hidden Markov model (HMM) based TTS system...

chapter

A hardware/software co-design architecture for ultrasonic flaw detection with Hidden Markov Model and wavelet transform

Kushal Virupakshappa, Erdal Oruklu

2017 IEEE International Ultrasonics Symposium (IUS) > 1

2017 IEEE International Ultrasonics Symposium (IUS)

This work presents an embedded hardware architecture for real-time ultrasonic NDE applications that incorporate Hidden Markov Model (HMM) based statistical signal methods. HMMs have been successfully used in applications such as audio segment retrieval, speech/language recognition, and image processing. Recently, we proposed a new Hidden Markov Model (HMM) based ultrasonic flaw detection algorithm...

chapter

A hardware/software co-design architecture for ultrasonic flaw detection with Hidden Markov Model and Wavelet Transform

Kushal Virupakshappa, Erdal Oruklu

2017 IEEE International Ultrasonics Symposium (IUS) > 1 - 4

2017 IEEE International Ultrasonics Symposium (IUS)

This work presents an embedded hardware architecture for real-time ultrasonic NDE applications that incorporate Hidden Markov Model (HMM) based statistical signal methods. The proposed algorithm is a combination of Discrete Wavelet Transform (DWT) for pre-processing A-scan signals and HMM for classification of the flaw presence. For this study, a MicroZed FPGA with Xilinx Zynq-7020 System-on-Chip (SoC)...

chapter

Leveraging deep neural networks with nonnegative representations for improved environmental sound classification

Victor Bisot, Romain Serizel, Slim Essid, Gael Richard

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) > 1 - 6

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)

This paper introduces the use of representations based on nonnegative matrix factorization (NMF) to train deep neural networks with applications to environmental sound classification. Deep learning systems for sound classification usually rely on the network to learn meaningful representations from spectrograms or hand-crafted features. Instead, we introduce an NMF-based feature learning stage before...

chapter

Speech recognition features based on deep latent Gaussian models

Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) > 1 - 6

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)

This paper constructs speech features based on a generative model using a deep latent Gaussian model (DLGM), which is trained using the stochastic gradient variational Bayes (SGVB) algorithm and performs efficient approximate inference and learning with a directed probabilistic graphical model. The trained DLGM then generates latent variables based on a Gaussian distribution, which are used as new features...

chapter

Sensor characteristic invariant feature for acoustic stationary pattern classification

S. Thirachai, S. Khomsay, J. Suwatthikul

2017 56th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE) > 141 - 144

2017 56th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE)

Calibrating microphones that have different characteristics is very difficult. This paper presents a feature extraction method as an alternative. The method provides acoustic features that are strongly robust against various characteristic transfer functions. The proposed method applies Local Binary Patterns (LBP) and Compressive Sensing (CS) which compare spectral details with spectral...

chapter

Audio/video supervised independent vector analysis through multimodal pilot dependent components

Francesco Nesta, Saeed Mosayyebpour, Zbynek Koldovsky, Karel Palecek

2017 25th European Signal Processing Conference (EUSIPCO) > 1150 - 1164

2017 25th European Signal Processing Conference (EUSIPCO)

Independent Vector Analysis is a powerful tool for estimating the broadband acoustic transfer function between multiple sources and the microphones in the frequency domain. In this work, we consider an extended IVA model which adopts the concept of pilot dependent signals. Without imposing any constraint on the de-mixing system, pilot signals depending on the target source are injected into the model...

chapter

Detection of alarm sounds in noisy environments

Dean Carmel, Ariel Yeshurun, Yair Moshe

2017 25th European Signal Processing Conference (EUSIPCO) > 1839 - 1843

2017 25th European Signal Processing Conference (EUSIPCO)

Sirens and alarms play an important role in everyday life since they warn people of hazardous situations, even when these are out of sight. Automatic detection of this class of sounds can help hearing impaired or distracted people, e.g., on the road, and contribute to their independence and safety. In this paper, we present a technique for the detection of alarm sounds in noisy environments. The technique...

chapter

Automatic detection of bird species from audio field recordings using HMM-based modelling of frequency tracks

Peter Jancovic, Munevver Kokuer

2017 25th European Signal Processing Conference (EUSIPCO) > 1779 - 1783

2017 25th European Signal Processing Conference (EUSIPCO)

This paper presents an automatic system for detection of bird species in field recordings. A sinusoidal detection algorithm is employed to segment the acoustic scene into isolated spectro-temporal segments. Each segment is represented as a temporal sequence of frequencies of the detected sinusoid, referred to as frequency track. Each bird species is represented by a set of hidden Markov models (HMMs),...

chapter

FPGA implementation of a support vector machine classifier for Ultrasonic flaw detection

Yiyue Jiang, Kushal Virupakshappa, Erdal Oruklu

2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS) > 180 - 183

2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS)

In this work, we investigate the hardware implementation of Support Vector Machine (SVM) prediction on an FPGA platform for industrial ultrasound applications. Specifically, SVM is used as classifier for identifying ultrasonic A-scan signals as signals with flaw or signals without flaw. Hardware acceleration using FPGA is the main theme of the presented work. The architecture used to implement the...

chapter

Optimizing acoustic feature extractor for anomalous sound detection based on Neyman-Pearson lemma

Yuma Koizumi, Shoichiro Saito, Hisashi Uematsu, Noboru Harada

2017 25th European Signal Processing Conference (EUSIPCO) > 698 - 702

2017 25th European Signal Processing Conference (EUSIPCO)

We propose a method for optimizing an acoustic feature extractor for anomalous sound detection (ASD). Most ASD systems adopt outlier-detection techniques because it is difficult to collect a massive amount of anomalous sound data. To improve the performance of such outlier-detection-based ASD, it is essential to extract a set of efficient acoustic features that is suitable for identifying anomalous...

chapter

A neural network approach for sound event detection in real life audio

Michele Valenti, Dario Tonelli, Fabio Vesperini, Emanuele Principi, et al.

2017 25th European Signal Processing Conference (EUSIPCO) > 2754 - 2758

2017 25th European Signal Processing Conference (EUSIPCO)

This paper presents and compares two algorithms based on artificial neural networks (ANNs) for sound event detection in real life audio. Both systems have been developed and evaluated with the material provided for the third task of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 challenge. For the first algorithm, we make use of an ANN trained on different features extracted...

chapter

Development of multilingual phone recognition system for Indian languages

K E Manjunath, K. Sreenivasa Rao, Dinesh Babu Jayagopi

2017 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES) > 1 - 6

2017 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES)

In this paper, the development of a Multilingual Phone Recognition System (MPRS) in the context of Indian languages is described. MPRS is a language-independent Phone Recognition System (PRS) that could recognise the phonetic units present in a speech utterance of any language. We have developed two bilingual PRSs and one quadrilingual PRS using four Indian languages: Kannada, Telugu, Bengali, and Odia. International...

chapter

Automated detection of geometric defects on connecting rod via acoustic resonance testing

Yun Zheng, Matthias Heinrich, Ahmad Osman, Bernd Valeske

2017 25th European Signal Processing Conference (EUSIPCO) > 1868 - 1872

2017 25th European Signal Processing Conference (EUSIPCO)

Fully automated defect detection and classification of automobile components are crucial for solving quality and efficiency problems for automotive manufacturers, due to rising wages, production costs, and warranty claims. However, metrological deviations in form still represent unsolved problems using state-of-the-art techniques, especially for forged or cast components with complex geometry...

chapter

Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging

Pawel Rosciszewski, Jakub Kaliski

2017 International Conference on High Performance Computing & Simulation (HPCS) > 560 - 565

2017 International Conference on High Performance Computing & Simulation (HPCS)

In this paper, we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training...

chapter

An improved residual LSTM architecture for acoustic modeling

Lu Huang, Ji Xu, Jiasong Sun, Yi Yang

2017 2nd International Conference on Computer and Communication Systems (ICCCS) > 101 - 105

2017 2nd International Conference on Computer and Communication Systems (ICCCS)

Long Short-Term Memory (LSTM) is the primary recurrent neural network architecture for acoustic modeling in automatic speech recognition systems. Residual learning is an efficient method to help neural networks converge more easily and faster. In this paper, we propose several types of residual LSTM methods for our acoustic modeling. Our experiments indicate that, compared with classic LSTM, our architecture...

chapter

Select-additive learning: Improving generalization in multimodal sentiment analysis

Haohan Wang, Aaksha Meghawat, Louis-Philippe Morency, Eric P. Xing

2017 IEEE International Conference on Multimedia and Expo (ICME) > 949 - 954

2017 IEEE International Conference on Multimedia and Expo (ICME)

Multimodal sentiment analysis is drawing an increasing amount of attention these days. It enables mining of opinions in video reviews which are now available aplenty on online platforms. However, multimodal sentiment analysis has only a few high-quality data sets annotated for training machine learning algorithms. These limited resources restrict the generalizability of models, where, for example,...

chapter

Random forest classification based acoustic event detection

Xianjun Xia, Roberto Togneri, Ferdous Sohel, David Huang

2017 IEEE International Conference on Multimedia and Expo (ICME) > 163 - 168

2017 IEEE International Conference on Multimedia and Expo (ICME)

This paper deals with acoustic event detection (AED), with the aim of improving the detection accuracy of acoustic events. The acoustic event detection task is performed by a regression via classification (RvC) based approach along with the random forest technique. A discretization process is used to convert the continuous frame positions within acoustic events into event duration class labels. Outputs of the category-specific...

chapter

Random forest regression based acoustic event detection with bottleneck features

Xianjun Xia, Roberto Togneri, Ferdous Sohel, David Huang

2017 IEEE International Conference on Multimedia and Expo (ICME) > 157 - 162

2017 IEEE International Conference on Multimedia and Expo (ICME)

This paper deals with random forest regression based acoustic event detection (AED) by combining acoustic features with bottleneck features (BN). The bottleneck features have a good reputation of being inherently discriminative in acoustic signal processing. To deal with the unstructured and complex real-world acoustic events, an acoustic event detection system is constructed using bottleneck features...

chapter

Improving acoustic modeling using audio-visual speech

Ahmed Hussen Abdelaziz

2017 IEEE International Conference on Multimedia and Expo (ICME) > 1081 - 1086

2017 IEEE International Conference on Multimedia and Expo (ICME)

Reliable visual features that encode the articulator movements of speakers can dramatically improve the decoding accuracy of automatic speech recognition systems when combined with the corresponding acoustic signals. In this paper, a novel framework is proposed to utilize audio-visual speech not only during decoding but also for training better acoustic models. In this framework, a multi-stream hidden...

Keywords:
TRAINING
ACOUSTICS

Content availability

Available (936)
None (1)

Publication type

book (816)
article (121)

Keywords

SPEECH (566)
HIDDEN MARKOV MODELS (487)
SPEECH RECOGNITION (429)
FEATURE EXTRACTION (218)
DATA MODELS (151)
ADAPTATION MODELS (103)
SPEECH PROCESSING (98)
TRAINING DATA (95)
NEURAL NETWORKS (93)
ACCURACY (90)
COMPUTATIONAL MODELING (85)
AUTOMATIC SPEECH RECOGNITION (75)
SUPPORT VECTOR MACHINES (74)
ARTIFICIAL NEURAL NETWORKS (71)
DATABASES (65)
VECTORS (58)
TESTING (56)
DECODING (54)
ADAPTATION MODEL (49)
NATURAL LANGUAGE PROCESSING (49)
MATHEMATICAL MODEL (47)
SPEAKER RECOGNITION (47)
ACOUSTIC SIGNAL PROCESSING (46)
NOISE (44)
ACOUSTIC MODELING (43)
CONTEXT (43)
HIDDEN MARKOV MODEL (42)
SPEECH SYNTHESIS (41)
ESTIMATION (40)
ROBUSTNESS (40)
DATA MINING (39)
SIGNAL PROCESSING (39)
DEEP NEURAL NETWORKS (38)
DEEP NEURAL NETWORK (36)
DISCRIMINATIVE TRAINING (36)
MAXIMUM LIKELIHOOD ESTIMATION (35)
LATTICES (34)
LEARNING (ARTIFICIAL INTELLIGENCE) (34)
TRANSFORMS (34)
ERROR ANALYSIS (32)
CLASSIFICATION ALGORITHMS (31)
VOCABULARY (31)
SIGNAL TO NOISE RATIO (29)
VISUALIZATION (28)
ACOUSTIC MODEL (27)
CONTEXT MODELING (27)
MACHINE LEARNING (26)
EMOTION RECOGNITION (25)
KERNEL (25)
DICTIONARIES (24)
NOISE MEASUREMENT (24)
STANDARDS (24)
OPTIMIZATION (23)
PATTERN RECOGNITION (23)
EQUATIONS (22)
GAUSSIAN PROCESSES (22)
INDEXES (22)
ALGORITHM DESIGN AND ANALYSIS (21)
EDUCATIONAL INSTITUTIONS (21)
MICROPHONES (21)
PROBABILITY (21)
SIGNAL PROCESSING ALGORITHMS (21)
CLUSTERING ALGORITHMS (20)
CONFERENCES (20)
RECURRENT NEURAL NETWORKS (20)
SPEAKER ADAPTATION (20)
COMPUTERS (19)
CORRELATION (19)
HMM (19)
SUPPORT VECTOR MACHINE CLASSIFICATION (18)
COMPUTER ARCHITECTURE (17)
GAUSSIAN MIXTURE MODEL (17)
UNSUPERVISED LEARNING (17)
COMPLEXITY THEORY (16)
DETECTORS (16)
LANGUAGE MODEL (16)
NEURAL NETS (16)
PRAGMATICS (16)
CONVOLUTION (15)
MEASUREMENT (15)
NIST (15)
PATTERN CLASSIFICATION (15)
ACOUSTIC MEASUREMENTS (14)
EVENT DETECTION (14)
KEYWORD SEARCH (14)
NEURONS (14)
PREDICTIVE MODELS (14)
SPEECH ENHANCEMENT (14)
VOICE CONVERSION (14)
JOINTS (13)
LVCSR (13)
MEL FREQUENCY CEPSTRAL COEFFICIENT (13)
SPEECH CODING (13)
SUPPORT VECTOR MACHINE (13)
TRAJECTORY (13)
APPROXIMATION METHODS (12)
DEEP LEARNING (12)
DNN (12)
