Search results

Items from 1 to 20 out of 43 results

chapter

Robust hands-free Automatic Speech Recognition for human-machine interaction

R Gomez, T Kawahara, K Nakadai

2010 10th IEEE-RAS International Conference on Humanoid Robots > 138 - 143

2010 10th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2010)

In enclosed environments where robots are deployed, the observed speech signal is smeared due to reverberation. This degrades the performance of the automatic speech recognition (ASR). Thus, hands-free speech recognition for human-machine communication is a difficult task. Most speech enhancement techniques used to address this problem enhance the contaminated waveform independent from that of the...

chapter

Novel active learning sample evaluation method based on multi-level confusion networks

Wei Chen, Gang Liu, Jun Guo

2010 2nd IEEE International Conference on Network Infrastructure and Digital Content > 134 - 139

2010 2nd IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC 2010)

Active Learning (AL) is designed to aid the labor-intensive process of training acoustic model for speech recognition. In AL, only the most informative training samples are selected for manual annotation. Thus, how to evaluate the unlabeled samples is worth researching. In this paper, we propose a unified framework to generate confusion networks of multiple levels including character, syllable and...

chapter

Automatic speaker verification experiments using HMM

Doru-Petru Munteanu, Stefan-Adrian Toma

2010 8th International Conference on Communications > 107 - 110

2010 8th International Conference on Communications (COMM)

This paper addresses the design and implementation of automatic speaker verification (ASV) systems. There is great interest in developing and increasing the performance of ASV applications, taking into account the advantages offered when compared to other biometrical methods. State-of-the-art speaker recognizers are based on statistical models such as GMM, HMM, SVM, ANN or hybrid models. This work...

chapter

North Atlantic Right Whale acoustic signal processing: Part I. comparison of machine learning recognition algorithms

Peter J Dugan, Aaron N Rice, Ildar R Urazghildiiev, Christopher W Clark

2010 IEEE Long Island Systems, Applications and Technology Conference > 1 - 6

2010 IEEE Long Island Systems, Applications and Technology Conference (LISAT 2010)

This paper compares three different approaches currently used in recognizing contact calls made from the North Atlantic Right Whale (NRW), Eubalaena glacialis. We present two new approaches consisting of machine learning algorithms based on artificial neural networks (NET) and the classification and regression tree classifiers (CART), and compare their performance with earlier work that employs multi-Stage...

chapter

North Atlantic right whale acoustic signal processing: Part II. improved decision architecture for auto-detection using multi-classifier combination methodology

Peter J Dugan, Aaron N Rice, Ildar R Urazghildiiev, Christopher W Clark

2010 IEEE Long Island Systems, Applications and Technology Conference > 1 - 6

2010 IEEE Long Island Systems, Applications and Technology Conference (LISAT 2010)

Autonomous signal detection of the North Atlantic right whale (NRW), Eubalaena glacialis, is becoming an important factor in monitoring and conservation for this highly endangered species. Both online and offline systems exist to help study and protect animals within this population. In both cases auto-detection of species-specific calls plays a vital role in localizing individual animal by searching...

chapter

GMM-HMM acoustic model training by a two level procedure with Gaussian components determined by automatic model selection

Dan Su, Xihong Wu, Lei Xu

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4890 - 4893

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper investigates the Bayesian Ying-Yang (BYY) learning for speech recognition via Gaussian mixture models (GMMs) based Hidden Markov models (HMMs). A two level procedure is proposed with the hidden Markov level trained still under the maximum likelihood principle by the Baum-Welch algorithm but with the GMMs level trained under the BYY best harmony. We proposed a new batch way EM-like Ying-Yang...

chapter

An acoustic segment model approach to incorporating temporal information into speaker modeling for text-independent speaker recognition

Yu Tsao, Hanwu Sun, Haizhou Li, Chin-Hui Lee

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4422 - 4425

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

We propose an acoustic segment model (ASM) approach to incorporating temporal information into speaker modeling in text-independent speaker recognition. In training, the proposed framework first estimates a collection of ASM-based universal background models (UBMs). Multiple sets of speaker-specific ASMs are then obtained by adapting the ASM-based UBMs with speaker-specific enrollment data. A novel...

chapter

The IBM 2008 GALE Arabic speech transcription system

George Saon, Hagen Soltau, Upendra Chaudhari, Stephen Chu, et al.

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4378 - 4381

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper describes the Arabic broadcast transcription system fielded by IBM in the GALE Phase 3.5 machine translation evaluation. Key advances compared to our Phase 2.5 system include improved discriminative training, the use of Subspace Gaussian Mixture Models (SGMM), neural network acoustic features, variable frame rate decoding, training data partitioning experiments, unpruned n-gram language...

chapter

Discriminative training methods for language models using conditional entropy criteria

Jui-Ting Huang, Xiao Li, Alex Acero

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5182 - 5185

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper addresses the problem of discriminative training of language models that does not require any transcribed acoustic data. We propose to minimize the conditional entropy of word sequences given phone sequences, and present two settings in which this criterion can be applied. In an inductive learning setting, the phonetic/acoustic confusability information is given by a general phone error...

chapter

Multilingual acoustic modeling for speech recognition based on subspace Gaussian Mixture Models

L Burget, P Schwarz, M Agarwal, P Akyazi, et al.

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4334 - 4337

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

Although research has previously been done on multilingual speech recognition, it has been found to be very difficult to improve over separately trained systems. The usual approach has been to use some kind of “universal phone set” that covers multiple languages. We report experiments on a different approach to multilingual speech recognition, in which the phone sets are entirely distinct but the...

chapter

Weakly supervised learning with decision trees applied to fisheries acoustics

R Lefort, R Fablet, J Boucher

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 2254 - 2257

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper addresses the training of classification trees for weakly labelled data. We call “weakly labelled data”, a training set such as the prior labelling information provided refers to vector that indicates the probabilities for instances to belong to each class. Classification tree typically deals with hard labelled data, in this paper a new procedure is suggested in order to train a tree from...

chapter

The research and implementation of acoustic module based Mandarin TTS

Cheng-Yu Yeh, Kuan-Lin Chen

2010 4th International Symposium on Communications, Control and Signal Processing (ISCCSP) > 1 - 4

4th International Symposium on Communications, Control and Signal Processing (ISCCSP 2010)

The primary study of this paper is focused on the acoustic module (AM) design in order to improve the performance of Mandarin TTS system. The AM is composed of the prosody generator, the spectrum generator, and the speech synthesizer. The HMM, recurrent neural network (RNN), and PSOLA algorithms are employed to build the AM. Finally, the performance analyses including the speech quality, memory requirement,...

chapter

HMM-based separation of acoustic transfer function for single-channel sound source localization

Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 2830 - 2833

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper presents a sound source (talker) localization method using only a single microphone, where a HMM (Hidden Markov Model) of clean speech is introduced to estimate the acoustic transfer function from a user's position. The new method is able to carry out this estimation without measuring impulse responses. The frame sequence of the acoustic transfer function is estimated by maximizing the...

chapter

Integrating recognition and retrieval with user feedback: A new framework for spoken term detection

Hung-yi Lee, Lin-shan Lee

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5290 - 5293

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

People usually consider recognition and retrieval as two cascaded independent modules for spoken term detection. Retrieval techniques were assumed to be applied on top of some ASR output, with performance depending on ASR accuracy. In this paper, we propose a new framework: to integrate the two parts into a single task. This can be achieved by adjusting the acoustic model parameters, borrowing the...

chapter

Battlefield Target Identification Based on Improved Grid-Search SVM Classifier

Jinghua Li, Congying Zhang, Zhenning Li

2009 International Conference on Computational Intelligence and Software Engineering > 1 - 4

2009 International Conference on Computational Intelligence and Software Engineering

Choosing the kernel and error penalty parameters for support vector machine (SVM) is very important for the performance of classifiers. An improved grid-search algorithm is proposed to choose the optimal parameters of SVM. The battlefield multi-target SVM classifier is designed using this algorithm. Also three classifiers including k-nearest neighborhood classifier, improved BP neural network classifier...

chapter

The Research of Vehicle Classification Using SVM and KNN in a Ramp

Zhang Changjun, Chen Yuzong

2009 International Forum on Computer Science-Technology and Applications > 3 > 391 - 394

2009 International Forum on Computer Science-Technology and Applications (IFCSTA 2009)

There is an important significance of the application for real-time classification by using of the acoustic and seismic signals generated by vehicles in the road ramp. The eight test points were put on the both sides of a road ramp, the some devices of acoustic and seismic sensors etc were put in each point. On the acquisition of acoustic and seismic signals, short-time Fourier transform (STFT) was...

chapter

An Effective CALL System for Strongly Accented Mandarin Speech

Tonghai Jiang, Ming Tang, Fengpei Ge, Changliang Liu, et al.

2009 International Conference on Research Challenges in Computer Science > 92 - 95

2009 International Conference on Research Challenges in Computer Science (ICRCCS 2009)

In this paper, we investigate some specific acoustic problems of the computer assisted language learning (CALL) system by modifying the acoustic model and feature under the speech recognition framework. At first, in order to alleviate the distortion of channel and speaker, speaker-dependent Cepstrum Mean Normalization (Speaker CMN) is adopted, by which the average correlation coefficient (ACC) between...

chapter

Single Sensor Acoustic Feature Extraction for Embedded Realtime Vehicle Classification

A. Starzacher, B. Rinner

2009 International Conference on Parallel and Distributed Computing, Applications and Technologies > 378 - 383

2009 International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT 2009)

Vehicle classification is an important task for various traffic monitoring applications. This paper investigates the capabilities of acoustic feature generation for vehicle classification. Six temporal and spectral features are extracted from the audio recordings. Six different classification algorithms are compared using the extracted features. We focus on a single sensor setting to keep the computational...

chapter

Improved Syllable Based Acoustic Modeling by Inter-Syllable Transition Model for Continuous Chinese Speech Recognition

Hao Chao, Wenju Liu

2009 Chinese Conference on Pattern Recognition > 1 - 4

2009 Chinese Conference on Pattern Recognition (CCPR 2009) and the First CJK Joint Workshop on Pattern Recognition (CJKPR)

Accurately modeling the acoustic variabilities caused by coarticulation is important in continuous speech recognition. Recent research indicates that syllable units do better in modeling intra-syllable co-articulation effect than sub-syllable units. However, most continuous Mandarin speech recognition systems use context dependent phones or initial/finals (IFs) as the basic acoustic unit because it...

chapter

Acoustic Fault Identification of Underwater Vehicles Based on NSOM-PNN

Ruipeng Luan, Kerong Ben, Lilin Cui

2009 International Conference on Artificial Intelligence and Computational Intelligence > 2 > 384 - 388

2009 International Conference on Artificial Intelligence and Computational Intelligence (AICI 2009)

Aiming at the requirement of class incremental learning in acoustic fault identification research, a network model using a novel Self-organizing map--negative self-organizing map (NSOM) and probabilistic neural network (PNN) is proposed. The experiment of acoustic fault identification of underwater vehicle shows that the proposed network has better capability of class incremental learning than traditional...

Keywords:
TRAINING
ACOUSTICS
ACOUSTIC SIGNAL PROCESSING

Keywords

SPEECH (22)
HIDDEN MARKOV MODELS (19)
SPEECH RECOGNITION (19)
FEATURE EXTRACTION (11)
LEARNING (ARTIFICIAL INTELLIGENCE) (11)
ACCURACY (9)
ARTIFICIAL NEURAL NETWORKS (8)
SUPPORT VECTOR MACHINES (8)
HIDDEN MARKOV MODEL (7)
NATURAL LANGUAGE PROCESSING (7)
SPEECH PROCESSING (7)
DATA MODELS (5)
NEURAL NETS (5)
AQUACULTURE (4)
BACKPROPAGATION (4)
CEPSTRAL ANALYSIS (4)
CLASSIFICATION ALGORITHMS (4)
GAUSSIAN PROCESSES (4)
MACHINE LEARNING (4)
MAXIMUM LIKELIHOOD ESTIMATION (4)
NOISE (4)
PATTERN RECOGNITION (4)
PROBABILITY (4)
SUPPORT VECTOR MACHINE (4)
ACOUSTIC MODEL (3)
ACOUSTIC SEGMENT MODEL (3)
BAYES METHODS (3)
COMPUTATIONAL COMPLEXITY (3)
COMPUTATIONAL MODELING (3)
DECODING (3)
EDUCATIONAL INSTITUTIONS (3)
ERROR STATISTICS (3)
FEATURE SELECTION (3)
FISHERIES ACOUSTICS (3)
INFORMATION RETRIEVAL (3)
MEL FREQUENCY CEPSTRAL COEFFICIENT (3)
PATTERN CLASSIFICATION (3)
PEDIATRICS (3)
ROOM IMPULSE RESPONSE (3)
SIGNAL CLASSIFICATION (3)
SPEAKER RECOGNITION (3)
TARGET TRACKING (3)
TESTING (3)
TRAINING DATA (3)
VOCABULARY (3)
ACOUSTIC FEATURES (2)
ACOUSTIC MODELING (2)
ACOUSTIC MONITORING (2)
ACOUSTIC SIGNAL (2)
ACTIVE LEARNING (2)
ADAPTATION MODEL (2)
ADAPTIVE LEARNING RATE BACK-PROPAGATION (2)
ARCHITECTURAL ACOUSTICS (2)
ARTIFICIAL NEURAL NETWORK (2)
AUTOMATED DETECTION (2)
AUTOMATIC SPEECH RECOGNITION (2)
BAYESIAN METHODS (2)
CLASSIFICATION (2)
CONTEXT (2)
CONTEXT MODELING (2)
DATA MINING (2)
DATABASES (2)
DISCRIMINATIVE TRAINING (2)
EQUAL ERROR RATE (2)
ERROR ANALYSIS (2)
ESTIMATION (2)
EVOLUTIONARY STRATEGIES (2)
GAUSSIAN DISTRIBUTION (2)
GAUSSIAN MIXTURE MODEL (2)
GENETIC FEATURE SELECTION SYSTEM (2)
HMM (2)
HUMANS (2)
HYBRID SYSTEM (2)
INDEPENDENT COMPONENT ANALYSIS (2)
INFANT CRY UNITS (2)
LATTICES (2)
MATHEMATICAL MODEL (2)
MICROPHONES (2)
NATURAL LANGUAGES (2)
OBJECT RECOGNITION (2)
PATHOLOGY (2)
PRINCIPAL COMPONENT ANALYSIS (2)
PROBABILITY DENSITY FUNCTION (2)
REVERBERATION (2)
REVERBERATION TIME (2)
RIGHT WHALE (2)
SENSORS (2)
SIGNAL PROCESSING (2)
SOLID MODELING (2)
SPEECH CODING (2)
STATISTICAL ANALYSIS (2)
SUBSPACE GAUSSIAN MIXTURE MODEL (2)
SUPERVISED LEARNING (2)
SYSTEM-ON-A-CHIP (2)
TARGET RECOGNITION (2)
TEXT ANALYSIS (2)
TRAFFIC ENGINEERING COMPUTING (2)

INFONA - science communication portal
