In this paper, we propose a two-stage phone recognition system using articulatory and spectral features. In the first stage, articulatory features are predicted from spectral features using feedforward neural networks (FFNNs). In the second stage, phone recognition is carried out using the predicted articulatory features and the spectral features together. FFNNs and Hidden Markov Models are explored for...
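The two-stage idea can be sketched generically: a small feedforward network maps each spectral frame to articulatory-feature scores, which are then stacked with the original spectral features as the second-stage input. The layer sizes, random weights, and feature dimensions below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def ffnn_forward(x, W1, b1, W2, b2):
    """One-hidden-layer feedforward net: spectral frames -> articulatory scores."""
    h = np.tanh(x @ W1 + b1)                          # hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))       # sigmoid outputs in (0, 1)

# Hypothetical sizes: 13 spectral coefficients in, 8 articulatory scores out.
W1, b1 = rng.standard_normal((13, 32)), np.zeros(32)
W2, b2 = rng.standard_normal((32, 8)), np.zeros(8)

spectral = rng.standard_normal((100, 13))             # 100 frames of spectral features
articulatory = ffnn_forward(spectral, W1, b1, W2, b2)
combined = np.hstack([spectral, articulatory])        # stage-2 input: both streams
print(combined.shape)                                 # (100, 21)
```

In a real system the weights would of course be trained on frames with articulatory labels; here they are random purely to show the data flow.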
The performance of Automatic Speech Recognition (ASR) systems built using close-talk microphones degrades in noisy environments. An ASR system built using Throat Microphone (TM) speech shows relatively better performance in such adverse conditions. However, some sounds are not well captured by a TM. In this work we explore the combined use of Normal Microphone (NM) and TM features to improve the recognition...
Hidden factors such as gender characteristics play an important role in the performance of Bangla (also widely known as Bengali) automatic speech recognition (ASR). If there is a suppression process that represses the decrease of acoustic-likelihood differences among categories resulting from gender factors, a robust ASR system can be realized. In our previous paper, we proposed a technique of gender effects...
Speaker-specific characteristics play an important role in the performance of Bangla (also widely known as Bengali) automatic speech recognition (ASR). It is difficult to recognize speech affected by gender factors, especially when an ASR system contains only a single acoustic model. If there exists a suppression process that represses the decrease of differences in acoustic likelihood among categories...
This paper presents a neural network-based Bangla phoneme recognition method for Automatic Speech Recognition (ASR). The method consists of three stages: in the first stage, a multilayer neural network (MLN) converts acoustic features, mel-frequency cepstral coefficients (MFCCs), into phoneme probabilities; the second stage computes velocity (Δ) coefficients from the phoneme probabilities by using...
In this paper, the auditory-like features MLPC and MFCC have been used as front-ends, and their performance has been evaluated on the Aurora-2 database for Hidden Markov Model (HMM)-based noisy speech recognition. The clean data set is used for training and test set A is used to examine performance. It has been found that almost the same recognition performance is obtained for both MLPC and MFCC, and...
This paper proposes a new robust speech recognition method. Since the hidden Markov model (HMM) algorithm requires a lot of training computation, a dynamic time warping (DTW) algorithm based on a median filter is used instead in our system. Non-speech segments are removed using the short-term energy method, which improves recognition accuracy. Cepstral mean subtraction (CMS), running...
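Short-term-energy endpoint detection of the kind mentioned above can be sketched as follows; the frame length, hop size, and relative threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np

def short_term_energy(signal, frame_len=160, hop=80):
    """Frame-wise short-term energy of a 1-D signal."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.array([np.sum(f.astype(float) ** 2) for f in frames])

def trim_nonspeech(signal, frame_len=160, hop=80, rel_threshold=0.01):
    """Drop leading/trailing frames whose energy falls below a fraction
    of the peak frame energy (a simple endpoint detector)."""
    energy = short_term_energy(signal, frame_len, hop)
    active = np.flatnonzero(energy > rel_threshold * energy.max())
    if active.size == 0:
        return signal[:0]
    start = active[0] * hop
    end = active[-1] * hop + frame_len
    return signal[start:end]

# Toy example: silence, a burst of "speech", silence.
sig = np.concatenate([np.zeros(800),
                      np.sin(np.linspace(0, 100, 800)),
                      np.zeros(800)])
trimmed = trim_nonspeech(sig)
print(len(sig), len(trimmed))  # the trimmed signal is much shorter
```

A production detector would typically smooth the energy contour and add a zero-crossing-rate check, but the thresholding idea is the same.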
In the Korean language, a large proportion of word units are pronounced differently from their written forms, owing to an agglutinative and highly inflective nature with severe phonological phenomena and coarticulation effects. This paper reports on an ongoing study of Korean pronunciation modeling, in which the mapping between phonemic and orthographic units is modeled by a Bayesian network (BN). The...
This paper describes an isolated word recognition method based on distinctive phonetic features (DPFs). The method comprises two multilayer neural networks (MLNs). The first MLN, MLNLF-DPF, maps local features (LFs) of an input speech signal into discrete DPFs, and the second MLN, MLNDyn, restricts the dynamics of the DPFs output by MLNLF-DPF. In experiments on the Tohokudai Isolated Spoken-Word Database...
This article introduces automatic speech recognition based on Electro-Magnetic Articulography (EMA). Movements of the tongue, lips, and jaw are tracked by an EMA device and used as features to create Hidden Markov Models (HMMs) and recognize speech from articulation alone, that is, without any audio information. Automatic phoneme recognition experiments are also conducted to examine the contribution...
A codebook design method for Hidden Markov Model (HMM) by using a Centroid Neural Network (CNN) is applied to a Korean monophone recognition problem in this paper. In order to alleviate the accuracy degradation problem in tied mixture HMM (TMHMM), this paper utilizes a clustering algorithm, called Centroid Neural Network with State Dependence measure (CNN(SD)), for TMHMMs. The CNN(SD) uses a novel...
In this paper, Arabic alphadigits were investigated from the speech recognition point of view. Limited-vocabulary Arabic Automatic Speech Recognition (ASR) systems were designed, implemented, and tested using isolated word utterances consisting of Arabic letters and/or digits. These systems were implemented separately using phoneme-level and word-level HMM models in distinct...
Pattern recognition has long been a topic of fundamental importance in a wide range of science and technology. Over the years, a range of tasks has been developed for speech recognition. While recent speech recognizer evaluation has focused on LVCSR research, we believe that evaluating recognition at the phone level is important, since words are always represented by the...
Before the advent of Hidden Markov Model (HMM)-based speech recognition, many speech applications were built using pattern matching algorithms such as Dynamic Time Warping (DTW), which are generally robust to noise and easy to implement. The standard DTW algorithm usually suffers from a lack of flexibility in its start and end matching points and has high computational cost. Although some DTW-based...
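The standard DTW recurrence referred to above can be sketched minimally as follows; unit-weight steps and an absolute-difference local cost are illustrative choices, not details from the abstract.

```python
import numpy as np

def dtw_distance(x, y):
    """Classic DTW between two 1-D sequences.
    Returns the minimum cumulative alignment cost with the usual
    three-way recurrence (insert / delete / match steps)."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j],       # insertion
                                 D[i, j - 1],       # deletion
                                 D[i - 1, j - 1])   # match
    return D[n, m]

# A time-stretched copy of a template aligns far more cheaply than a mismatch.
template = [0, 1, 2, 3, 2, 1, 0]
warped   = [0, 1, 1, 2, 3, 3, 2, 1, 0]   # same shape, stretched in time
other    = [3, 3, 3, 0, 0, 0, 3, 3, 3]
print(dtw_distance(template, warped), dtw_distance(template, other))
```

Note that both endpoints are pinned at (0, 0) and (n, m), which is exactly the start/end inflexibility the abstract complains about; relaxed-endpoint variants allow the path to begin or end within a small band.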
Digit speech recognition is important in many applications such as automatic data entry, PIN entry, voice dialing, automated banking systems, etc. This paper presents a speaker-independent speech recognition system for Malayalam digits. The system employs mel-frequency cepstral coefficients (MFCCs) as features for signal processing and hidden Markov models (HMMs) for recognition. The system is trained...
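The mel warping that underlies MFCC features can be illustrated with the common HTK-style formula mel = 2595·log10(1 + f/700); the filter count and band edges below are illustrative assumptions, not the paper's settings.

```python
import math

def hz_to_mel(f):
    """HTK-style mel scale used when placing MFCC filterbank centers."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# Centers of a hypothetical 10-filter mel filterbank spanning 0-8000 Hz:
lo, hi, n = hz_to_mel(0.0), hz_to_mel(8000.0), 10
centers = [mel_to_hz(lo + (hi - lo) * (k + 1) / (n + 1)) for k in range(n)]
print([round(c) for c in centers])  # spacing widens with frequency
```

Centers equally spaced in mel are packed tightly at low frequencies and spread out at high frequencies, mimicking the ear's resolution; triangular filters placed at these centers feed the log-energy and DCT steps of MFCC extraction.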
A multi-pattern Viterbi algorithm (MPVA) that jointly decodes and recognizes multiple speech patterns for automatic speech recognition (ASR) is proposed. The MPVA is a generalization of the Viterbi algorithm (VA) to jointly decode multiple patterns for a given standard hidden Markov model (HMM). Unlike our previously proposed constrained multi-pattern Viterbi algorithm (CMPVA), the MPVA does not require...
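For reference, the single-pattern Viterbi algorithm that the MPVA generalizes can be sketched for a discrete-output HMM; the two-state toy model parameters below are assumptions for illustration only.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Viterbi decoding for a discrete-output HMM.
    log_pi: (S,) initial log-probs; log_A: (S, S) transition log-probs
    (row = from-state); log_B: (S, V) emission log-probs; obs: symbol indices.
    Returns the best state path and its log-probability."""
    S, T = len(log_pi), len(obs)
    delta = log_pi + log_B[:, obs[0]]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A          # (from, to) cumulative scores
        back[t] = np.argmax(scores, axis=0)      # best predecessor per state
        delta = scores[back[t], np.arange(S)] + log_B[:, obs[t]]
    # Trace back the best state sequence.
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(np.max(delta))

# Two-state toy HMM: state 0 tends to emit symbol 0, state 1 symbol 1.
log_pi = np.log([0.6, 0.4])
log_A = np.log([[0.7, 0.3], [0.3, 0.7]])
log_B = np.log([[0.9, 0.1], [0.2, 0.8]])
states, score = viterbi(log_pi, log_A, log_B, [0, 0, 1, 1, 1])
print(states)  # [0, 0, 1, 1, 1]
```

The multi-pattern extension decodes several observation sequences against the same HMM jointly, so the trellis ranges over tuples of time indices rather than a single time axis.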
The automatic phonetic time-alignment of speech databases is essential for the development cycle of a text-to-speech (TTS) system. Furthermore, the quality of the synthesized speech signals is strongly related to the precision of the produced alignment. In the present work we study the performance of a new HMM-based speech segmentation method. The method is based on hybrid embedded and isolated-unit...
Stream weight training is one of the key issues in bimodal integration for audio-visual speech recognition. In this paper, audio-only and video-only HMM classifiers are combined for audio-visual speech recognition. More specifically, a discriminative training method is provided, in which the state-dependent stream weights are trained based on lattice rescoring by the minimum phone...
This paper addresses the problem of using unstructured queries to search a structured database in voice search applications. By incorporating structural information in music metadata, the end-to-end search error has been reduced by 15% on text queries and up to 11% on spoken queries. Based on that, an HMM sequential rescoring model has reduced the error rate by 28% on text queries and up to 23% on...
The authors previously reported speaker-dependent automatic speech recognition accuracy for isolated words using eleven surface-electromyographic (sEMG) sensors in fixed recording locations on the face and neck. The original array of sensors was chosen to ensure ample coverage of the muscle groups known to contribute to articulation during speech production. In this paper we systematically analyzed...