Search results

Items from 1 to 20 out of 30 results

chapter

Emotion recognition using LP residual at sub-segmental, segmental and supra-segmental levels

Jainath Yadav, Anshu Kumari, K. Sreenivasa Rao

2015 International Conference on Communication, Information & Computing Technology (ICCICT) > 1 - 6

2015 International Conference on Communication, Information & Computing Technology (ICCICT)

This paper is concerned with speech signal based emotion recognition. Linear Prediction (LP) residual mainly contains source specific emotional information. LP residual is derived by inverse filtering of the speech signal. For characterizing the basic emotions, LP residual has been explored at sub-segmental level, segmental level, supra-segmental level, respectively. Gaussian mixture models (GMMs)...

chapter

Implementation and optimization of a speech recognition system based on hidden Markov model using genetic algorithm

Hassan Farsi, Reza Saleh

2014 Iranian Conference on Intelligent Systems (ICIS) > 1 - 5

2014 Iranian Conference on Intelligent Systems (ICIS)

In this paper, a speech recognition system with isolated words is implemented. Discrete hidden Markov model is used to recognize words. Feature vector consists of cepstral and delta cepstrum coefficients which are extracted from speech signal frames. Since the discrete Markov model is used, the feature vector is mapped to a discrete element by a vector quantizer. One of the problems we face in training...

chapter

Speech enhancement in noisy environment using voice activity detection and wavelet thresholding

K R Borisagar, D G Kamdar, B S Sedani, G R Kulkarni

2010 IEEE International Conference on Computational Intelligence and Computing Research > 1 - 5

2010 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC 2010)

Voice activity detection (VAD) is an outstanding problem for speech transmission, enhancement and recognition. The variety and the varying nature of speech and background noise make it especially challenging. In the past years, many features emphasizing the differences between speech and noise have been proposed for their robustness. However, an important problem in many areas of speech processing...

chapter

Spectral and textural feature-based system for automatic detection of fricatives and affricates

D Ruinskiy, N Dadush, Y Lavner

2010 IEEE 26-th Convention of Electrical and Electronics Engineers in Israel > 771 - 775

2010 IEEE 26th Convention of Electrical & Electronics Engineers in Israel (IEEEI 2010)

Phoneme spotting in continuous speech has various applications - in speech recognition, smart audio filtering, multimedia synchronization and other fields. Many studies on phoneme spotting have been conducted, using different approaches. We present two algorithms for spotting fricatives (such as /s/, /sh/, /f/) and affricates (/ts/, /ch/) - one based on a cepstrogram-matching approach, and the other...

chapter

Comparison of LDM and HMM for an Application of a Speech

V A Mane, A B Patil, K P Paradeshi

2010 International Conference on Advances in Recent Technologies in Communication and Computing > 431 - 436

2010 International Conference on Advances in Recent Technologies in Communication and Computing (ARTCom 2010)

Automatic speech recognition (ASR) has moved from science-fiction fantasy to daily reality for citizens of technological societies. Some people seek it out, preferring dictating to typing, or benefiting from voice control of aids such as wheel-chairs. Others find it embedded in their hi-tec gadgetry, in mobile phones and car navigation systems, or cropping up in what would have until recently been...

chapter

Lip reading using optical flow and support vector machines

A A Shaikh, D K Kumar, W C Yau, M Z C Azemin, et al.

2010 3rd International Congress on Image and Signal Processing > 1 > 327 - 330

3rd International Congress on Image and Signal Processing (CISP 2010)

This paper presents a lip reading technique to classify discrete utterances without evaluating the acoustic signals. The reported technique analyses the video data of lip motions by computing the optical flow (OF). The statistical properties of the vertical OF component were used to form the feature vectors for training the support vector machine (SVM) classifier. The impact of the variation...

chapter

The application of Local Linear Neuro Fuzzy model in recognition of online Persian isolated characters

Koorosh Samimi Daryoush, Maryam Khademi, Alireza Nikookar, Aida Farahani

2010 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE) > 5 > V5-574 - V5-577

2010 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE 2010)

In this paper, we propose an approach for recognizing online Persian isolated characters using LLNF model. Local Linear Neuro Fuzzy (LLNF) Model is a powerful approach for classification tasks. It uses divide-and-conquer strategy to partition the problem space into sub-problems and construct Local Linear Models (LLMs). In order to classify the characters, at first, we extract some generic features...

chapter

Speech recognition system based on DSP and SVM

Xiu-Qing Zhang, Shu-Wang Chen

2010 International Conference on Machine Learning and Cybernetics > 5 > 2313 - 2316

2010 International Conference on Machine Learning and Cybernetics (ICMLC 2010)

Speech recognition can achieve simple human-computer interaction and voice control. It is widely used in industrial control, consumer electronics and many other fields. Drawing on the characteristics of human physiology, the paper presents a higher-performance speech recognition system for specific people and isolated words. It is realized on a DSP (Digital Signal Processor) system by using the LPMCC...

chapter

A Novel Fuzzy Kernel Vector Quantization for Speaker Recognition with Short Utterances

Lin Lin, Chen Jian, Sun Xiaoying

2010 International Conference on Electrical and Control Engineering > 194 - 197

2010 International Conference on Electrical and Control Engineering (ICECE 2010)

When only a few seconds of training and testing data are available, the number of feature vectors obtained is too small to model and discriminate speakers well. This paper presents a new method for speaker recognition with short utterances. Through a non-linear mapping, it uses sectional set fuzzy Vector Quantization with the Lp norm to form the speaker's model in the high-dimensional feature...

chapter

English digits speech recognition system based on Hidden Markov Models

A A M Abushariah, T S Gunawan, O O Khalifa, M A M Abushariah

International Conference on Computer and Communication Engineering (ICCCE'10) > 1 - 5

2010 International Conference on Computer and Communication Engineering (ICCCE 2010)

This paper aims to design and implement an English digits speech recognition system using Matlab (GUI). This work was based on the Hidden Markov Model (HMM), which provides a highly reliable way of recognizing speech. The system is able to recognize the speech waveform by translating it into a set of feature vectors using the Mel Frequency Cepstral Coefficients (MFCC) technique. This paper...

chapter

Two-stage feature compensation of clean and telephone speech signals employing bidirectional neural network

I Esmaili, M Vali, J Kabudian

10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010) > 157 - 160

2010 10th International Conference on Information Sciences, Signal Processing and their Applications (ISSPA 2010)

In this paper, we continue our previous work on nonlinear feature compensation of distortions in clean and telephone speech recognition systems. We have shown that Bidirectional Neural Network (Bidi-NN) can compensate nonlinearly-distorted components of feature vectors. In this study, we present a new effort to improve recognition accuracy on clean and telephone speech data by employing a two-stage...

chapter

Enhancement of speech recognition using a variable-length frame overlapping method

Ing-jr Ding

2010 International Symposium on Computer, Communication, Control and Automation (3CA) > 1 > 375 - 377

2010 International Symposium on Computer, Communication, Control and Automation (3CA 2010)

This paper proposes a method of determining a variable-length frame overlap between two consecutive frames for speech recognition. Compared with the conventional fixed-length frame overlapping method, the proposed method can improve the performance of speech signal processing when performing the front-end processing procedure of speech recognition. By varying the length of frame overlaps using the...

chapter

Auditory model based modified MFCC features

S Chatterjee, W B Kleijn

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4590 - 4593

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

Using spectral and spectro-temporal auditory models, we develop a computationally simple feature vector based on the design architecture of existing mel frequency cepstral coefficients (MFCCs). Along with the use of an optimized static function to compress a set of filter bank energies, we propose to use a memory-based adaptive compression function to incorporate the behavior of human auditory response...

chapter

Noise robust speech activity detection

W.H. Abdulla, Zhou Guan, Hou Chi Sou

2009 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 473 - 477

2009 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2009)

An efficient noise robust feature is presented to track the speech activity in noisy environments. Speech is modeled by one class of 16 phone-like Gaussian mixtures while noises are modeled by 15 classes of 6 mixtures each. The feature vector used is a concatenation of carefully selected coefficients from MFCC, LPCC, and their first and second derivatives. A finite state machine and energy validation...

chapter

Identification of hearing disorder by multi-band entropy cepstrum extraction from infant's cry

M.M. Jam, H. Sadjedi

2009 International Conference on Biomedical and Pharmaceutical Engineering > 1 - 5

2009 International Conference on Biomedical and Pharmaceutical Engineering (ICBPE 2009)

An infant's cry is a multimodal behavior that contains a lot of information about the infant, particularly about the infant's health. In this paper, a new feature for infant cry analysis is presented for distinguishing two groups, infants with hearing disorder and normal infants, by Mel frequency multi-band entropy cepstrum extraction from the infant's cry. Signal processing stage is included...

chapter

Maximum Likelihood Discriminant Feature for Text-Independent Speaker Verification

Qingsong Liu, Beiqian Dai

2009 2nd International Congress on Image and Signal Processing > 1 - 4

2009 2nd International Congress on Image and Signal Processing (CISP)

Feature extraction is an essential first step in speaker verification applications. In addition to static features extracted from each frame of speech data, it is beneficial to use dynamic features that use information from neighboring frames. In this paper a new feature estimation method based on maximum likelihood discriminant analysis is presented. We compare it to traditional MFCC features in...

chapter

Noise robust speech recognition by combining speech enhancement in the wavelet domain and Lin-log RASTA

Yang Jie, Wang Zhenli

2009 ISECS International Colloquium on Computing, Communication, Control, and Management > 2 > 415 - 418

2009 ISECS International Colloquium on Computing, Communication, Control, and Management (CCCM)

For improving noise robustness of speech recognition under adverse noise environment, a method of noise robust speech recognition, which combines discrete wavelet transform (DWT), wavelet packet decomposition (WPD) and Lin-log RASTA, is researched in this paper. After one scale of DWT was employed for noisy speech, this method used three scales of DWT and three scales of WPD for the low frequency...

chapter

Multi-level Speech Emotion Recognition Based on HMM and ANN

Xia Mao, Lijiang Chen, Liqin Fu

2009 WRI World Congress on Computer Science and Information Engineering > 7 > 225 - 229

2009 WRI World Congress on Computer Science and Information Engineering, CSIE

This paper proposes a new approach for emotion recognition based on a hybrid of hidden Markov models (HMMs) and artificial neural networks (ANNs), using both utterance and segment level information from speech. To combine the dynamic time warping capability of HMMs with the pattern recognition capability of ANNs, the utterance is viewed as a series of voiced segments, and feature vectors extracted from...

chapter

Real-time end point detection specialized for acceleration signal

Jong Gwan Lim, Sang-Youn Kim, Dong-Soo Kwon

2009 ICCAS-SICE > 5331 - 5335

2009 ICROS-SICE International Joint Conference. ICCAS-SICE 2009

Due to the temporal and spectral differences between speech and acceleration signals, the conventional end point detection (EPD) used in automatic speech recognition cannot be directly applied to acceleration, and the threshold-based algorithms found in the literature are too heuristic to be accepted for automatic EPD. In this regard, for motion detection by acceleration, supervised learning in pattern recognition is...

chapter

Subword Latent Semantic Analysis for Texttiling-Based Automatic Story Segmentation of Chinese Broadcast News

Yulian Yang, Lei Xie

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

2008 6th International Symposium on Chinese Spoken Language Processing

This paper proposes to perform latent semantic analysis (LSA) on character/syllable n-gram sequences of automatic speech recognition (ASR) transcripts, namely subword LSA, as an extension of our previous work on subword text tiling for automatic story segmentation of Chinese broadcast news. LSA represents the 'meaning' of a lexical term by a feature vector conveying the term's relations with other...

Keywords:
SPEECH RECOGNITION

Publication type

book (29)
article (1)

Keywords

FEATURE EXTRACTION (16)
SPEECH (16)
MEL FREQUENCY CEPSTRAL COEFFICIENT (8)
HIDDEN MARKOV MODELS (7)
CEPSTRAL ANALYSIS (6)
SPEECH PROCESSING (6)
AUTOMATIC SPEECH RECOGNITION (5)
HIDDEN MARKOV MODEL (5)
NEURAL NETS (5)
TRAINING (5)
MULTILAYER PERCEPTRONS (4)
NOISE MEASUREMENT (4)
ARTIFICIAL NEURAL NETWORKS (3)
ASR (3)
EMOTION RECOGNITION (3)
MULTILAYER PERCEPTRON (3)
NATURAL LANGUAGE PROCESSING (3)
NOISE (3)
SIGNAL CLASSIFICATION (3)
SPEAKER RECOGNITION (3)
SPEECH ENHANCEMENT (3)
ADAPTATION MODEL (2)
ALGORITHM DESIGN AND ANALYSIS (2)
ANN (2)
ARTIFICIAL NEURAL NETWORK (2)
CLASSIFICATION TASK (2)
DATABASES (2)
DISCRETE COSINE TRANSFORM (2)
DISCRETE COSINE TRANSFORMS (2)
DISCRETE WAVELET TRANSFORM (2)
GAUSSIAN PROCESSES (2)
HEARING (2)
LINEAR PREDICTION COEFFICIENT (2)
MAXIMUM LIKELIHOOD ESTIMATION (2)
MEL FREQUENCY CEPSTRAL COEFFICIENTS (2)
MFCC (2)
MLP (2)
NATURAL LANGUAGES (2)
NEURAL NETWORKS (2)
NOISY ENVIRONMENT (2)
PATTERN CLASSIFICATION (2)
PATTERN RECOGNITION (2)
ROBUST SPEECH RECOGNITION (2)
ROBUSTNESS (2)
SELF-ORGANISING FEATURE MAPS (2)
SIGNAL PROCESSING (2)
SUPPORT VECTOR MACHINE (2)
SUPPORT VECTOR MACHINES (2)
VECTOR QUANTIZATION (2)
VECTORS (2)
VOICE ACTIVITY DETECTION (2)
WHITE NOISE (2)
2005 NIST LANGUAGE RECOGNITION EVALUATION (1)
ACCELERATION (1)
ACCELERATION SIGNAL (1)
ACCELEROMETERS (1)
ACOUSTIC DISTORTION (1)
ACOUSTIC MODELING (1)
ACOUSTIC SIGNAL DETECTION (1)
ACOUSTIC SIGNAL PROCESSING (1)
ACOUSTIC VECTOR SEQUENCES (1)
ADDITIVE NOISE (1)
AFFRICATES (1)
ARABIC BROADCAST NEWS (1)
ARTICULATION (1)
ASR PERFORMANCE (1)
AUDIO SIGNAL (1)
AUDIO SIGNAL PROCESSING (1)
AUDITORY MODEL (1)
AUDITORY SYSTEM (1)
AUTOCORRELATION (1)
AUTOMATIC DETECTION (1)
AUTOMATIC SPEECH RECOGNITION TRANSCRIPTS (1)
AUTOMATIC STORY SEGMENTATION (1)
AUTOREGRESSIVE STATE EVOLUTION (1)
BACKGROUND MUSIC (1)
BACKGROUND NOISE (1)
BASELINE SYSTEM (1)
BAUM-WELCH ALGORITHM (1)
BAYES CLASSIFIER (1)
BIDIRECTIONAL NEURAL NETWORK (1)
BIDIRECTIONAL NEURAL NETWORK (BIDI-NN) (1)
BIOMEDICAL MEASUREMENT (1)
BOTTLE-NECK (1)
BOTTLE-NECK FEATURE EXTRACTION (1)
BURST SPECTRUM (1)
CAR NAVIGATION SYSTEM (1)
CEPSTRAL COEFFICIENTS (1)
CEPSTRAL FEATURES (1)
CEPSTRAL MEAN SUBTRACTION (1)
CEPSTRAL MEAN SUBTRACTION (CMS) (1)
CEPSTROGRAM MATCHING (1)
CHARACTER RECOGNITION (1)
CHARACTER/SYLLABLE N-GRAM SEQUENCES (1)
CHINESE BROADCAST NEWS (1)
CLASSIFICATION ALGORITHMS (1)
CLEAN SPEECH SIGNAL (1)
COLORED NOISES (1)

INFONA - science communication portal
