Multi-Layer Perceptron (MLP) features extracted from different types of critical band energies (CRBE), derived from MFCC, GT, and PLP pipelines, are compared on a French broadcast news and conversational speech recognition task. Though the MLP structure is kept fixed, ROVER combination of the different CRBE-based systems leads to a 4% relative improvement. Furthermore, aiming at the combination of state-of-the-art...
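ROVER combines the outputs of several recognizers by aligning their hypotheses into a word transition network and voting slot by slot. A minimal sketch of the voting stage only, assuming the word sequences are already aligned and of equal length (real ROVER performs the dynamic-programming alignment first; the hypothesis lists below are invented):

```python
from collections import Counter

def majority_vote(hypotheses):
    """Per-slot majority vote over already-aligned, equal-length word
    sequences: a toy stand-in for ROVER's voting stage."""
    assert len({len(h) for h in hypotheses}) == 1, "sequences must be aligned"
    combined = []
    for slot in zip(*hypotheses):
        word, _count = Counter(slot).most_common(1)[0]
        combined.append(word)
    return combined

# Three system outputs for the same utterance (hypothetical):
h1 = ["the", "cat", "sat"]
h2 = ["the", "hat", "sat"]
h3 = ["the", "cat", "mat"]
print(majority_vote([h1, h2, h3]))  # -> ['the', 'cat', 'sat']
```

In practice the vote can also be weighted by each system's word confidence scores rather than by plain counts.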
Animals cannot communicate the different states of their being, such as a normal, hunger, or heat state, through semantics. However, they do produce vocalizations in different states. In this paper, we start from the hypothesis that the specific state of an animal can be identified by analyzing its vocal signals. We use a variety of spectral features for the purpose of identifying the type...
This paper describes an improved speaker diarization system for multiple distant microphone (MDM) meeting conversations. First, the new system includes a modified speech activity detector (SAD). Second, it adopts new spectral features based on the equivalent rectangular bandwidth (ERB) or Bark scale, which are compared with traditional Mel Frequency Cepstral Coefficient (MFCC) features. Third,...
This contribution reports experiments with different speech feature extraction methods and strategies, with the goal of improving the recognition rate of the speech recognizer in an automatic audio transcription system. The extraction of speech features is based on MFCC (Mel Frequency Cepstral Coefficients) and PLP (Perceptual Linear Prediction), which...
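The standard MFCC pipeline is pre-emphasis, framing and windowing, power spectrum, triangular mel filterbank, log, and a DCT. A self-contained NumPy sketch of that pipeline (typical but assumed parameter values: 16 kHz audio, 25 ms frames, 10 ms hop, 26 mel filters, 13 cepstra; this is an illustration, not any cited system's exact front-end):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, frame_len=400, hop=160,
         n_mels=26, n_ceps=13):
    # Pre-emphasis boosts high frequencies.
    sig = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Frame the signal and apply a Hamming window.
    n_frames = 1 + max(0, (len(sig) - frame_len) // hop)
    frames = np.stack([sig[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    frames = frames * np.hamming(frame_len)
    # Per-frame power spectrum.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel filterbank, equally spaced on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        for k in range(l, c):
            fbank[m - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fbank[m - 1, k] = (r - k) / max(r - c, 1)
    log_energy = np.log(power @ fbank.T + 1e-10)
    # DCT-II decorrelates the log filterbank energies.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps),
                                  (2 * n + 1) / (2 * n_mels)))
    return log_energy @ dct.T

# 0.1 s of synthetic audio (hypothetical test signal):
t = np.arange(1600) / 16000.0
feats = mfcc(np.sin(2 * np.pi * 440 * t))
print(feats.shape)  # -> (8, 13): frames x cepstral coefficients
```

PLP follows the same framing front half but replaces the mel filterbank and DCT with Bark-scale integration, equal-loudness pre-emphasis, and linear-prediction cepstra.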
Hidden factors such as gender characteristics play an important role in the performance of Bangla (also known as Bengali) automatic speech recognition (ASR). If there is a suppression process that represses the decrease of differences in acoustic likelihood among categories resulting from gender factors, a robust ASR system can be realized. In our previous paper, we proposed a technique of gender effects...
In this paper, Wavelet-Based Mel Frequency Cepstral Coefficient (WMFCC) features are proposed for speaker verification. The performance of WMFCC features is evaluated and compared with that of Mel Frequency Cepstral Coefficient (MFCC) features. A database of ten Hindi digits spoken by sixteen speakers is used in the simulations. Gaussian Mixture Models (GMMs) are used for maximum log...
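In GMM-based speaker verification, a claimant's feature frames are typically scored by their average per-frame log-likelihood under a speaker model (and compared against a background model). A minimal sketch of diagonal-covariance GMM scoring with log-sum-exp for stability; all model parameters below are invented for illustration:

```python
import numpy as np

def diag_gmm_loglik(X, weights, means, variances):
    """Average per-frame log-likelihood of frames X (frames x dims)
    under a diagonal-covariance GMM."""
    X = np.atleast_2d(X)
    d = X.shape[1]
    # log N(x | mu_k, diag(var_k)) for every frame/component pair.
    diff2 = (X[:, None, :] - means[None, :, :]) ** 2 / variances[None, :, :]
    log_gauss = -0.5 * (diff2.sum(-1)
                        + np.log(variances).sum(-1)[None, :]
                        + d * np.log(2 * np.pi))
    # log sum_k w_k N(...) via log-sum-exp for numerical stability.
    a = np.log(weights)[None, :] + log_gauss
    m = a.max(axis=1, keepdims=True)
    frame_ll = (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True))).ravel()
    return frame_ll.mean()

# Toy 1-D model with two components (hypothetical parameters):
w = np.array([0.5, 0.5])
mu = np.array([[0.0], [5.0]])
var = np.array([[1.0], [1.0]])
score = diag_gmm_loglik(np.array([[0.1], [4.9]]), w, mu, var)
```

Verification then accepts the claimed identity when the speaker-model score exceeds the background-model score by a tuned threshold.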
This paper presents a hybrid Multilayer Neural Network-based Bangla phoneme recognition method for Automatic Speech Recognition (ASR) incorporating dynamic parameters. The method consists of four stages: at the first stage, a multilayer neural network (MLN) converts acoustic features, mel frequency cepstral coefficients (MFCCs), into phoneme probabilities. Phoneme probabilities from the first...
This paper presents an alternative to the Mel Frequency Cepstral Coefficient (MFCC)-based method of feature extraction for robust text-independent speaker identification. This work focuses on increasing the identification accuracy without increasing the size and complexity of the filter bank. The drive for this new feature extraction technique comes from a transformation based on the Nyquist...
Speaker-specific characteristics play an important role in the performance of Bangla (also known as Bengali) automatic speech recognition (ASR). It is difficult to recognize speech affected by gender factors, especially when an ASR system contains only a single acoustic model. If there exists any suppression process that represses the decrease of differences in acoustic likelihood among categories...
This paper presents a Neural Network-based Bangla phoneme recognition method for Automatic Speech Recognition (ASR). The method consists of three stages: at the first stage, a multilayer neural network (MLN) converts acoustic features, mel frequency cepstral coefficients (MFCCs), into phoneme probabilities, and the second stage computes velocity (Δ) coefficients from the phoneme probabilities by using...
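Velocity (delta) coefficients are conventionally computed with a regression over a few neighboring frames, Δ_t = Σ_{n=1..N} n·(c_{t+n} − c_{t−n}) / (2 Σ_{n=1..N} n²). A short sketch of that standard formula (with edge frames padded by repetition; the window size N=2 is a common but assumed choice):

```python
import numpy as np

def delta(features, N=2):
    """Velocity (delta) coefficients by the standard regression formula,
    applied along the time axis of a (frames x dims) array."""
    denom = 2 * sum(n * n for n in range(1, N + 1))
    # Repeat the first and last frames so edges get a full window.
    padded = np.pad(features, ((N, N), (0, 0)), mode="edge")
    out = np.zeros_like(features, dtype=float)
    for n in range(1, N + 1):
        out += n * (padded[N + n: N + n + len(features)]
                    - padded[N - n: N - n + len(features)])
    return out / denom

# A linearly increasing trajectory has constant velocity 1 away from edges:
traj = np.arange(8, dtype=float).reshape(-1, 1)
d = delta(traj)
```

Acceleration (ΔΔ) coefficients are obtained by applying the same operator to the delta stream.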
The selection of speech features for speech recognition has been investigated for languages other than Arabic. The Arabic language has its own characteristics; hence, some speech features may be better suited for Arabic speech recognition than others. In this paper, some feature extraction techniques are explored to find the features that give the highest speech recognition rate. Our investigation...
This paper investigates a noisy Tibetan speech recognition algorithm based on a wavelet neural network (WNN) combined with auditory features. A recognition classifier based on the WNN was designed, and the Mel Frequency Cepstral Coefficient (MFCC) feature was used. The algorithm was then simulated under different signal-to-noise ratios (SNRs), and the results illustrated...
This paper presents the performance of deaf speech recognition using hidden Markov models. Even persons with perfect nasal and oral cavities cannot produce sounds if they are deaf, since they cannot hear anything. If deafness is detected early, a speech therapist can help them reproduce sounds to the greatest extent possible. Depending on their degree of hearing loss, they are classified as deaf, profoundly deaf...
For the task of detecting shouted speech in a noisy environment, this paper introduces a system based on mel frequency cepstral coefficient (MFCC) feature extraction, unsupervised frame dropping and Gaussian mixture model (GMM) classification. The evaluation material consists of phonemically identical speech and shouting as well as environmental noise of varying levels. The performance of the shout...
In this paper, we propose a novel parts-based, binary-valued feature for ASR. This feature is extracted using boosted ensembles of simple threshold-based classifiers, each of which looks at a specific pair of time-frequency bins on the spectro-temporal plane. These features, termed Boosted Binary Features (BBF), are integrated into a standard HMM-based system by using a multilayer perceptron...
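Each base classifier in such an ensemble is essentially a decision stump over one pair of spectro-temporal bins. A toy sketch of that idea, assuming a fire/not-fire rule on the energy difference of the two bins; the bin indices, thresholds, and spectrogram values below are illustrative, not learned by boosting:

```python
import numpy as np

def bin_pair_stump(spec, t1, f1, t2, f2, threshold):
    """One threshold classifier over a pair of time-frequency bins:
    +1 when the energy difference exceeds the threshold, else -1."""
    return 1 if spec[t1, f1] - spec[t2, f2] > threshold else -1

def bbf_vector(spec, stump_params):
    """Binary feature vector: one bit per stump in the ensemble."""
    return np.array([bin_pair_stump(spec, *p) for p in stump_params])

# Toy log-spectrogram (time x frequency), hypothetical values:
spec = np.array([[1.0, 3.0],
                 [0.5, 2.0]])
params = [(0, 1, 1, 0, 1.0),   # 3.0 - 0.5 = 2.5 > 1.0 -> +1
          (1, 0, 0, 1, 0.0)]   # 0.5 - 3.0 = -2.5, not > 0 -> -1
print(bbf_vector(spec, params))  # -> [ 1 -1]
```

In the actual method, boosting selects which bin pairs and thresholds to use, and the resulting binary vector feeds the MLP.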
In this paper, a voice recognition algorithm based on HMMs (Hidden Markov Models) is analyzed in detail. The HMM voice recognition algorithm is explained, and the importance of the voice information DB for improving the voice recognition rate is shown. The feature vector of each voice characteristic parameter is chosen by means of MFCC (Mel Frequency Cepstral Coefficients). The extracting...
This paper proposes a robust and automated applause detection algorithm for meeting speech. The features used in the proposed algorithm are short-time autocorrelation features, such as the autocorrelation energy decay factor, the amplitude and lag of the first local minimum, and the zero-crossing points, extracted from the autocorrelation sequence of a windowed audio signal. We apply decision thresholds for...
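The named quantities can all be read off one frame's normalized autocorrelation sequence. A sketch of plausible definitions; these are one reasonable reading of the feature names in the abstract, not the authors' exact formulas, and the test frame is synthetic:

```python
import numpy as np

def autocorr_features(frame):
    """Short-time autocorrelation features of one windowed frame:
    (energy decay factor, lag and amplitude of the first local
    minimum, zero-crossing count of the autocorrelation sequence)."""
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    ac = ac / (ac[0] + 1e-12)          # normalize so ac[0] == 1
    # Decay factor: ratio of late to early autocorrelation mass.
    half = len(ac) // 2
    decay = ac[half:].sum() / (ac[:half].sum() + 1e-12)
    # First local minimum of the sequence.
    min_lag, min_amp = 0, ac[0]
    for k in range(1, len(ac) - 1):
        if ac[k] < ac[k - 1] and ac[k] <= ac[k + 1]:
            min_lag, min_amp = k, ac[k]
            break
    # Zero crossings of the autocorrelation sequence.
    zc = int(np.sum(np.signbit(ac[:-1]) != np.signbit(ac[1:])))
    return decay, min_lag, float(min_amp), zc

# Windowed sinusoid frame, period 32 samples (hypothetical input):
t = np.arange(256)
feat = autocorr_features(np.hamming(256) * np.sin(2 * np.pi * t / 32))
```

For a periodic signal the first local minimum falls near half the pitch period; noisy applause yields a fast-decaying, weakly periodic sequence, which is what the thresholds exploit.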
The efficiency of a speech recognition system in a noise-free environment is impressive, but in the presence of environmental noise it deteriorates drastically. Environmental noise also affects human-to-human and human-to-machine communication, degrading both speech quality and intelligibility. Here, a speech recognition system is proposed in the presence...
Recently, several multi-layer perceptron (MLP)-based front-ends have been developed and used for Mandarin speech recognition, often showing significant complementary properties to conventional spectral features. Although these front-ends are widely used in multiple Mandarin systems, no systematic comparison of the different approaches or of their scalability has been presented. The novelty of this correspondence...
In automatic speech recognition systems, diagonal-covariance GMM-based CDHMM modeling is commonly used, so a suitable feature transformation is needed to decorrelate the input feature vectors and satisfy the diagonal-GMM assumption. In this paper, we introduce the use of several supervised linear feature transformations in speech recognition tasks. Specifically, each of these methods has particular...
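Fisher LDA is one common supervised linear transformation of this kind: it fits a projection from labeled frames and then applies it to every feature vector. A minimal NumPy sketch, shown as one representative choice (HLDA or MLLT would follow the same fit-then-project shape); the two-class toy data below is synthetic:

```python
import numpy as np

def lda_transform(X, y, n_out):
    """Fisher LDA: project onto the leading eigenvectors of
    inv(Sw) @ Sb, where Sw/Sb are within/between-class scatter."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mu)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Solve the generalized eigenproblem (Sw lightly regularized).
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(d), Sb))
    order = np.argsort(-vals.real)
    W = vecs[:, order[:n_out]].real
    return X @ W, W

# Two well-separated synthetic classes in 3-D (hypothetical data):
rng = np.random.default_rng(0)
Xa = rng.normal([0.0, 0.0, 0.0], 0.5, size=(50, 3))
Xb = rng.normal([3.0, 0.0, 0.0], 0.5, size=(50, 3))
X = np.vstack([Xa, Xb])
y = np.array([0] * 50 + [1] * 50)
Z, W = lda_transform(X, y, 1)
```

In an ASR front-end, the classes would typically be HMM states or phonemes, and the projected features would replace (or augment) the original spliced vectors before diagonal-GMM training.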