Advanced search

Advanced search in people

From:

To:

Items from 1 to 20 out of 66 results

chapter

On statistical machine translation method for lexicon refinement in speech recognition

Haihua Xu, Xiong Xiao, Eng-Siong Chng, Haizhou Li

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 25 - 29

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

In low resource Automatic Speech Recognition (ASR), one usually resorts to the Statistical Machine Translation (SMT) technique to learn transform rules to refine grapheme lexicon. To do this, we face two challenges. One is to generate grapheme sequences from the training data as the targets, which is paired with the original transcripts to train SMT models; the other is to effectively prune the learned...

chapter

Evaluation of wains as a classifier for automatic speech recognition

Rosemary T. Salaja, Ronan Flynn, Michael Russell

2015 26th Irish Signals and Systems Conference (ISSC) > 1 - 6

2015 26th Irish Signals and Systems Conference (ISSC)

This paper introduces a new back-end classifier for a speech recognition system that is based on artificial life (ALife). The ALife species being used for classification purposes are called wains, which were developed using the Créatúr framework. The speech recognition task used in the evaluation of the new classifier is that of isolated digit recognition. Performance of the proposed back-end classifier...

chapter

Cepstral noise subtraction for robust automatic speech recognition

Robert Rehr, Timo Gerkmann

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 375 - 378

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The robustness of speech recognizers towards noise can be increased by normalizing the statistical moments of the Mel-frequency cepstral coefficients (MFCCs), e. g. by using cepstral mean normalization (CMN) or cepstral mean and variance normalization (CMVN). The necessary statistics are estimated over a long time window and often, a complete utterance is chosen. Consequently, changes in the background...

chapter

ASR error detection and recognition rate estimation using deep bidirectional recurrent neural networks

Atsunori Ogawa, Takaaki Hori

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4370 - 4374

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Recurrent neural networks (RNNs) have recently been applied as the classifiers for sequential labeling problems. In this paper, deep bidirectional RNNs (DBRNNs) are applied for the first time to error detection in automatic speech recognition (ASR), which is a sequential labeling problem. We investigate three types of ASR error detection tasks, i.e. confidence estimation, out-of-vocabulary word detection...

chapter

On the impact of sentence length on recognition accuracy

Tomas Valenta, Lubos Smidl

2014 12th International Conference on Signal Processing (ICSP) > 500 - 504

2014 12th International Conference on Signal Processing (ICSP 2014)

The goal of this article is to analyse how the length of utterances affects performance of an automatic speech recognizer (ASR). Benchmarks of an ASR system were performed for utterances of various lengths on English and Czech corpora. Then the observed phenomena are tried to be explained theoretically. Eventually, results are summarized and some conclusions drawn.

chapter

Evaluating vad for automatic speech recognition

Sibo Tong, Nanxin Chen, Yanmin Qian, Kai Yu

2014 12th International Conference on Signal Processing (ICSP) > 2308 - 2314

2014 12th International Conference on Signal Processing (ICSP 2014)

Voice activity detection (VAD) plays a crucial role in speech processing, especially in automatic speech recognition (ASR). It identifies the boundaries of the speech to be recognized and the boundary accuracies may significantly affect the recognition performance. Conventional VAD evaluation criteria are mostly based on frame-level accuracy of speech/non-speech classification, which may result in...

chapter

A novel classifier modification approach to missing data problem for noisy speech recognition

Kian Ebrahim Kafoori, Seyed Mohammad Ahadi

7'th International Symposium on Telecommunications (IST'2014) > 458 - 463

2014 7th International Symposium on Telecommunications (IST)

Missing data theory has recently been used as a solution to noise robustness issue in Automatic Speech Recognition (ASR). Missing components of spectrogram can either be reconstructed, as carried out in Spectral Imputation, or simply ignored, as done in classifier modification. Most of the research has been focused on imputation because of the problems associated with classifier modification approaches...

chapter

Improving recognition of syallabic units of Hindi languagae using combined features of Throat Microphone and Normal Microphone speech

N. Radha, A. Shahina, G. Vinoth, A. Nayeemulla Khan

2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT) > 1343 - 1348

2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT)

The performance of Automatic Speech recognition system (ASR) built using close talk microphones degrades in noisy environments. AS R built using Throat Microphone (TM) speech shows relatively better performance under such adverse situations. However, some of the sounds are not well captured in TM. In this work we explore the combined use of Normal Microphone (NM) and TM features to improve the recognition...

chapter

Automatic evaluation of hypernasality and speech intelligibility for children with cleft palate

Ling He, Jing Zhang, Qi Liu, Heng Yin, more

2013 IEEE 8th Conference on Industrial Electronics and Applications (ICIEA) > 220 - 223

2013 IEEE 8th Conference on Industrial Electronics and Applications (ICIEA)

The speech of cleft palate (CP) patients has typical characteristics. Hypernasality and low speech intelligibility are the primary characteristics for CP speech. In this work, an automatic evaluation of different levels of hypernasality and speech intelligibility algorithm for CP speech was proposed, in order to provide an objective tool for speech therapist. To identify different levels of hypernasality,...

chapter

Connected-digits recognition for an under-resourced language using Hidden Markov Models

Mabu Johannes Manaileng, Madimetja Jonas Manamela

Proceedings ELMAR-2013 > 211 - 214

2013 55th International Symposium ELMAR

This paper presents the development of a speech recognition system for automatically recognizing fluently spoken digit strings in Northern Sotho. The digit strings can be isolated or connected/continuous with known or unknown length. The digit recognition system has been trained with the aim of satisfying its potential end-users. Our main research focus was to enhance the robustness of a connected-digits...

article

A Voice-Input Voice-Output Communication Aid for People With Severe Speech Impairment

Mark S. Hawley, Stuart P. Cunningham, Phil D. Green, Pam Enderby, more

IEEE Transactions on Neural Systems and Rehabilitation Engineering > 2013 > 21 > 1 > 23 - 31

A new form of augmentative and alternative communication (AAC) device for people with severe speech impairment—the voice-input voice-output communication aid (VIVOCA)—is described. The VIVOCA recognizes the disordered speech of the user and builds messages, which are converted into synthetic speech. System development was carried out employing user-centered design and development methods, which identified...

chapter

The CU-MFEC corpus for Thai and english spelling speech recognition

Natthawut Kertkeidkachorn, Supadaech Chanjaradwichai, Teera Suri, Krerksak Likitsupin, more

2012 International Conference on Speech Database and Assessments > 18 - 23

2012 Oriental COCOSDA 2012 - International Conference on Speech Database and Assessments

Much of the efficiency of any Automatic Speech Recognition (ASR) system depends on its speech corpus. This is even more so for recognizers designed for specific tasks. Naturally, an ASR for spelling recognition performs better if it is trained with a spelling speech corpus rather than a generic one. Although several speech corpora are available in Thai, we are still lack of Thai spelling speech corpora...

chapter

On the application of reverberation suppression to robust speech recognition

Roland Maas, Emanuel A.P. Habets, Armin Sehr, Walter Kellermann

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 297 - 300

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper, we study the effect of the design parameters of a single-channel reverberation suppression algorithm on reverberation-robust speech recognition. At the same time, reverberation compensation at the speech recognizer is investigated. The analysis reveals that it is highly beneficial to attenuate only the reverberation tail after approximately 50 ms while coping with the early reflections...

chapter

Easy does it: Robust spectro-temporal many-stream ASR without fine tuning streams

Suman V. Ravuri, Nelson Morgan

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4309 - 4312

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

Previous work has shown that spectro-temporal features reduce the word error rate for automatic speech recognition under noisy conditions. These systems, however, required significant hand-tuning in order to determine which spectral and temporal modulations should be included in a particular stream. In this work, streams are split into one spectral and temporal modulation each and their posterior...

chapter

Bangla ASR design by suppressing gender factor with gender-independent and gender-based HMM classifiers

Foyzul Hassan, Mohammed Rokibul Alam Kotwal, Mohammad Nurul Huda

2011 World Congress on Information and Communication Technologies > 1276 - 1281

2011 World Congress on Information and Communication Technologies (WICT)

Hidden factor such as gender characteristic plays an important role on the performance of Bangla (widely used as Bengali) automatic speech recognition (ASR). If there is a suppression process that represses the decrease of differences in acoustic-likelihood among categories resulted from gender factors, a robust ASR system can be realized. In our previous paper, we proposed a technique of gender effects...

chapter

Gender Effects Suppression in Bangla ASR by Designing Multiple HMM-Based Classifiers

Mohammed Rokibul Alam Kotwal, Foyzul Hassan, Md. Shafiul Alam, Shakib Ibn Daud, more

2011 International Conference on Computational Intelligence and Communication Networks > 390 - 394

2011 International Conference on Computational Intelligence and Communication Networks (CICN)

Speaker-specific characteristics play an important role on the performance of Bangla (widely used as Bengali) automatic speech recognition (ASR). It is difficult to recognize speech affected by gender factors, especially when an ASR system contains only a single acoustic model. If there exists any suppression process that represses the decrease of differences in acoustic-likelihood among categories...

chapter

Hybrid Features for Neural Network-Based Bangla ASR Incorporrating Velocity Coefficients (?)

Mohammed Rokibul Alam Kotwal, Foyzul Hassan, Shakib Ibn Daud, Md. Shafiul Alam, more

2011 International Conference on Computational Intelligence and Communication Networks > 416 - 420

2011 International Conference on Computational Intelligence and Communication Networks (CICN)

This paper presents a Neural Network-based Bangla phoneme recognition method for Automatic Speech Recognition (ASR). The method consists of three stages: at first stage, a multilayer neural network (MLN) converts acoustic features, mel frequency cepstral coefficients (MFCCs), into phoneme probabilities, where the second stage computes velocity (?) coefficients from the phoneme probabilities by using...

chapter

Integrating articulatory features using Kullback-Leibler divergence based acoustic model for phoneme recognition

Ramya Rasipuram, Magimai.-Doss Mathew

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5192 - 5195

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this paper, we propose a novel framework to integrate articulatory features (AFs) into HMM- based ASR system. This is achieved by using posterior probabilities of different AFs (estimated by multilayer perceptrons) directly as observation features in Kullback-Leibler divergence based HMM (KL-HMM) system. On the TIMIT phoneme recognition task, the proposed framework yields a phoneme recognition...

chapter

Comparing multilayer perceptron to Deep Belief Network Tandem features for robust ASR

Oriol Vinyals, Suman V. Ravuri

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4596 - 4599

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this paper, we extend the work done on integrating multilayer perceptron (MLP) networks with HMM systems via the Tandem approach. In particular, we explore whether the use of Deep Belief Networks (DBN) adds any substantial gain over MLPs on the Aurora2 speech recognition task under mismatched noise conditions. Our findings suggest that DBNs outperform single layer MLPs under the clean condition,...

chapter

Acoustic features for detection of aspirated stops

V Patil, P Rao

2011 National Conference on Communications (NCC) > 1 - 5

2011 National Conference on Communications (NCC)

Aspiration is an important phonemic feature in several Indian languages. Unlike English, languages such as Marathi have lexicons in which words with different meanings differ only in the aspiration feature of the initial voiced or unvoiced stop. Thus the reliable discrimination of aspirated stops from their unaspirated counterparts is important in automatic speech recognition for such languages. The...

Keywords:
ACCURACY
SPEECH RECOGNITION
AUTOMATIC SPEECH RECOGNITION

Publication date

Set your own date range

INFONA - science communication portal

Advanced search

Advanced search in people

On statistical machine translation method for lexicon refinement in speech recognition

Evaluation of wains as a classifier for automatic speech recognition

Cepstral noise subtraction for robust automatic speech recognition

ASR error detection and recognition rate estimation using deep bidirectional recurrent neural networks

On the impact of sentence length on recognition accuracy

Evaluating vad for automatic speech recognition

A novel classifier modification approach to missing data problem for noisy speech recognition

Improving recognition of syallabic units of Hindi languagae using combined features of Throat Microphone and Normal Microphone speech

Automatic evaluation of hypernasality and speech intelligibility for children with cleft palate

Connected-digits recognition for an under-resourced language using Hidden Markov Models

A Voice-Input Voice-Output Communication Aid for People With Severe Speech Impairment

The CU-MFEC corpus for Thai and english spelling speech recognition

On the application of reverberation suppression to robust speech recognition

Easy does it: Robust spectro-temporal many-stream ASR without fine tuning streams

Bangla ASR design by suppressing gender factor with gender-independent and gender-based HMM classifiers

Gender Effects Suppression in Bangla ASR by Designing Multiple HMM-Based Classifiers

Hybrid Features for Neural Network-Based Bangla ASR Incorporrating Velocity Coefficients (?)

Integrating articulatory features using Kullback-Leibler divergence based acoustic model for phoneme recognition

Comparing multilayer perceptron to Deep Belief Network Tandem features for robust ASR

Acoustic features for detection of aspirated stops

Filter options

Publication date

Content availability

Publication type

Keywords

INFONA - science communication portal

Advanced search

Advanced search in people

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Content availability

Publication type

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options