Items from 1 to 20 out of 58 results

chapter

Hidden Markov Model system training using HTK

Faquih Aiman, Zia Saquib, Shikha Nema

2016 International Conference on Advanced Communication Control and Computing Technologies (ICACCCT) > 806 - 809

2016 International Conference on Advanced Communication Control and Computing Technologies (ICACCCT)

In general, in any speech processing system, speech signals are pre-processed to extract features at the front end, and estimation is performed at the back end. The Hidden Markov Model is exclusively used for modeling time-varying vector sequences due to its simplicity. It also provides high accuracy in non-stationary environments. In this paper, the HTK (Hidden Markov Model Toolkit) is used for compiling...

chapter

On the use of EMD for automatic newborn cry segmentation

Lina Abou-Abbas, Leila Montazeri, Christian Gargour, Chakib Tadj

2015 International Conference on Advances in Biomedical Engineering (ICABME) > 262 - 265

2015 International Conference on Advances in Biomedical Engineering (ICABME)

Cry segmentation is an essential preprocessing step in any infant crying diagnosis system. Besides crying sounds consisting of expiration phases followed by short periods of inspiration episodes, each recording of newborn cries also includes silence sections as well as other sounds such as speech of caregivers, noise and sound of medical equipments. This paper is devoted to a newly developed Empirical...

chapter

A novel classifier modification approach to missing data problem for noisy speech recognition

Kian Ebrahim Kafoori, Seyed Mohammad Ahadi

7'th International Symposium on Telecommunications (IST'2014) > 458 - 463

2014 7th International Symposium on Telecommunications (IST)

Missing data theory has recently been used as a solution to noise robustness issue in Automatic Speech Recognition (ASR). Missing components of spectrogram can either be reconstructed, as carried out in Spectral Imputation, or simply ignored, as done in classifier modification. Most of the research has been focused on imputation because of the problems associated with classifier modification approaches...

chapter

Bangla phonetic feature table construction for automatic speech recognition

Foyzul Hassan, Mohammed Rokibul Alam Kotwal, Mohammad Nurul Huda

16th Int'l Conf. Computer and Information Technology > 51 - 55

2013 16th International Conference on Computer and Information Technology (ICCIT)

This research constructs a phonetic feature (PF) table for all the phonemes pronounced in the Bangla (widely known as Bengali) language, where the whole study is divided into two parts. In the first part, a PF table is constructed, while the second part deals with Bangla automatic speech recognition (ASR) using PFs. For the Bangla language, fifty-three phonemes including both vowels and consonants are...

chapter

Connected-digits recognition for an under-resourced language using Hidden Markov Models

Mabu Johannes Manaileng, Madimetja Jonas Manamela

Proceedings ELMAR-2013 > 211 - 214

2013 55th International Symposium ELMAR

This paper presents the development of a speech recognition system for automatically recognizing fluently spoken digit strings in Northern Sotho. The digit strings can be isolated or connected/continuous with known or unknown length. The digit recognition system has been trained with the aim of satisfying its potential end-users. Our main research focus was to enhance the robustness of a connected-digits...

chapter

Speech Endpoint Detection Based on Improved Cepstral Mean Subtraction

Feifei Du, Qizhi Huang, Chengyuan Wei, Bo Wang

2012 Second International Conference on Intelligent System Design and Engineering Application > 1121 - 1124

2012 Second International Conference on Intelligent System Design and Engineering Application (ISDEA)

This paper presents a novel endpoint detection method based on Cepstral Mean Subtraction (CMS) for robust and accurate speech recognition in noisy environments. The improved method based on CMS applies Hidden Markov Model (HMM) to do two-step classification for better performance, using the optimal spectral feature subset extracted according to the rule of minimum conditional entropy. In addition,...

chapter

Use of fuzzy min-max neural network for speaker identification

N. P. Jawarkar, R. S. Holambe, T. K. Basu

2011 International Conference on Recent Trends in Information Technology (ICRTIT) > 178 - 182

2011 International Conference on Recent Trends in Information Technology (ICRTIT)

This paper presents the use of fuzzy min-max neural network for the text independent speaker identification. The fuzzy min-max neural network utilizes fuzzy sets as pattern classes. It is a three layer feedforward network that grows adaptively to meet the demands of the problem. The database containing speech utterances recorded from fifty speakers in Marathi language is used for experimentation....

chapter

Experiments in context-independent recognition of non-lexical ‘yes’ or ‘no’ responses

Shiva Sundaram, Robert Schleicher, Nathalie Diehl

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5696 - 5699

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

We present our experiments in context-free recognition of non-lexical responses. Non-lexical verbal responses such as mmm-hmm or uh-huh are used by listeners to signal confirmation, uncertainty in understanding, agreement or disagreement in speech-based interaction between humans. Correct recognition of these utterances by speech interfaces can lead to a more natural interaction paradigm with computers...

chapter

Speech Enhancement Using MMSE Estimation and Spectral Subtraction Methods

V K Gupta, A Bhowmick, M Chandra, S N Sharan

2011 International Conference on Devices and Communications (ICDeCom) > 1 - 5

2011 International Conference on Devices and Communications (ICDeCom)

Efficiency of the speech recognition system in noise free environment is impressive but in the presence of environmental noise the efficiency of the speech recognition system deteriorates drastically. Environmental noise also affects human-to-human or human-to-machine communications and degrades the speech quality as well as intelligibility. Here a speech recognition system is proposed in presence...

article

Segmentation of Monologues in Audio Books for Building Synthetic Voices

K Prahallad, A W Black

IEEE Transactions on Audio, Speech, and Language Processing > 2011 > 19 > 5 > 1444 - 1449

One of the issues in using audio books for building a synthetic voice is the segmentation of large speech files. The use of the Viterbi algorithm to obtain phone boundaries on large audio files fails primarily because of huge memory requirements. Earlier works have attempted to resolve this problem by using large vocabulary speech recognition system employing restricted dictionary and language model...

chapter

Technology of the speaker verification under stress

R Martsyshyn, Y Rashkevych

2011 11th International Conference The Experience of Designing and Application of CAD Systems in Microelectronics (CADSM) > 438 - 439

2011 11th International Conference The Experience of Designing and Application of CAD Systems in Microelectronics (CADSM 2011)

In this paper the technology verification of the speaker under stress and built a structure of the system verification.

chapter

An efficient multi-modal biometric person authentication system using Fuzzy Logic

S Vasuhi, V Vaidehi, N T N Babu, T M Treesa

ICoAC 2010 > 74 - 81

2010 2nd International Conference on Advanced Computing (ICoAC)

This paper proposes a system obtained through decision level fusion of two well known biometric sensors to identify a person namely, Fingerprint sensor and Voice sensor. More than one sensor is needed for critical or highly secured areas. This paper proposes a multiple sensor data fusion methodology using Fuzzy Logic (FL) approach. The finger prints recognition system uses orientation of the input...

chapter

A novel method for Text-Independent speaker identification using MFCC and GMM

M S Sinith, Anoop Salim, K Gowri Sankar, K V Sandeep Narayanan, et al.

2010 International Conference on Audio, Language and Image Processing > 292 - 296

2010 International Conference on Audio, Language and Image Processing (ICALIP)

The area of speaker recognition is concerned with extracting the identity of the person speaking. Speaker recognition can be classified into speaker identification and speaker verification. Speaker identification can be Text-Independent or Text-Dependent. In this paper we lay emphasis on text-Independent speaker identification system where we adopted Mel-Frequency Cepstral Coefficients (MFCC) as the...

chapter

Voice Content Matching System for Quran Readers

W M Muhammad, R Muhammad, A Muhammad, A M Martinez-Enriquez

2010 Ninth Mexican International Conference on Artificial Intelligence > 148 - 153

2010 Ninth Mexican International Conference on Artificial Intelligence (MICAI 2010)

In Islam, mistakes in recitation of the holy Quran (the sacred book of Muslims) are forbidden. Mistakes can be missing words or verses, or misreading Harakat (pronunciations, punctuations, and accents). Thus, a hafiz/reciter who memorizes the holy Quran needs another hafiz/tutor who listens to the recitation and points out oral mistakes. Due to the serious commitment, the availability and expertise of...

chapter

High-Level Feature Extraction Using SIFT GMMs and Audio Models

N Inoue, T Saito, K Shinoda, S Furui

2010 20th International Conference on Pattern Recognition > 3220 - 3223

2010 20th International Conference on Pattern Recognition (ICPR 2010)

We propose a statistical framework for high-level feature extraction that uses SIFT Gaussian mixture models (GMMs) and audio models. SIFT features were extracted from all the image frames and modeled by a GMM. In addition, we used mel-frequency cepstral coefficients and ergodic hidden Markov models to detect high-level features in audio streams. The best result obtained by using SIFT GMMs in terms...

chapter

Auditory Features Revisited for Robust Speech Recognition

F Kelly, N Harte

2010 20th International Conference on Pattern Recognition > 4456 - 4459

2010 20th International Conference on Pattern Recognition (ICPR 2010)

Auditory based front-ends for speech recognition have been compared before, but this paper focuses on two of the most promising algorithms for noise robustness in automatic speech recognition (ASR). The feature sets are Zero-Crossings with Peak Amplitudes (ZCPA) and the recently introduced Power-Law Nonlinearity and Power-Bias Subtraction (PNCC). Standard Mel-Frequency Cepstral Coefficients (MFCC)...

chapter

Multisensor Fusion in Smartphones for Lifestyle Monitoring

Raghu Kiran Ganti, Soundararajan Srinivasan, Aca Gacic

2010 International Conference on Body Sensor Networks > 36 - 43

2010 International Conference on Body Sensor Networks (BSN)

Smartphones with diverse sensing capabilities are becoming widely available and pervasive in use. With the phone becoming a mobile personal computer, integrated applications can use multi-sensory data to derive information about the user's actions and the context in which these actions occur. This paper develops a novel method to assess daily living patterns using a smartphone equipped with microphones...

chapter

Natural speaker-independent Arabic speech recognition system based on Hidden Markov Models using Sphinx tools

Mohammad A M Abushariah, Raja N Ainon, Roziati Zainuddin, Moustafa Elshafei, et al.

International Conference on Computer and Communication Engineering (ICCCE'10) > 1 - 6

2010 International Conference on Computer and Communication Engineering (ICCCE 2010)

This paper reports the design, implementation, and evaluation of a research work for developing a high performance natural speaker-independent Arabic continuous speech recognition system. It aims to explore the usefulness and success of a newly developed speech corpus, which is phonetically rich and balanced, presenting a competitive approach towards the development of an Arabic ASR system as compared...

chapter

Insect Sound Recognition Based on SBC and HMM

Zhu Leqing, Zhang Zhen

2010 International Conference on Intelligent Computation Technology and Automation > 2 > 544 - 548

2010 International Conference on Intelligent Computation Technology and Automation (ICICTA 2010)

In order to help general technicians recognize insects conveniently in pest management, this paper proposes a viable scheme to identify insect sounds automatically by using Sub-band based cepstral (SBC) features and a Hidden Markov Model (HMM). The acoustic signal is preprocessed and segmented into a series of sound samples. SBC is extracted from the sound sample as the feature, and HMMs are trained with given...

chapter

Selecting static and dynamic features using an advanced auditory model for speech recognition

C Koniaris, S Chatterjee, W B Kleijn

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4342 - 4345

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

We describe a method to select features for speech recognition that is based on a quantitative model of the human auditory periphery. The method maximizes the similarity of the geometry of the space spanned by the subset of features and the geometry of the space spanned by the auditory model output. The selection method uses a spectro-temporal auditory model that captures both frequency- and time-domain...

Content availability:
Available
Keywords:
FEATURE EXTRACTION
HIDDEN MARKOV MODELS
CEPSTRAL ANALYSIS

Publication type

book (56)
article (2)

Keywords

SPEECH (39)
SPEECH RECOGNITION (32)
MEL FREQUENCY CEPSTRAL COEFFICIENT (19)
TRAINING (17)
SPEECH PROCESSING (14)
MFCC (13)
SPEAKER RECOGNITION (13)
HIDDEN MARKOV MODEL (12)
HMM (11)
NOISE (11)
ACCURACY (9)
DATABASES (9)
ACOUSTIC SIGNAL PROCESSING (7)
ACOUSTICS (7)
NATURAL LANGUAGE PROCESSING (7)
ROBUSTNESS (7)
ARTIFICIAL NEURAL NETWORKS (6)
DATA MINING (6)
GAUSSIAN DISTRIBUTION (6)
MEL-FREQUENCY CEPSTRAL COEFFICIENT (6)
SIGNAL CLASSIFICATION (6)
TESTING (6)
AUTOMATIC SPEECH RECOGNITION (5)
GAUSSIAN PROCESSES (5)
MEL-FREQUENCY CEPSTRAL COEFFICIENTS (5)
ALGORITHM DESIGN AND ANALYSIS (4)
ANALYTICAL MODELS (4)
CLASSIFICATION ALGORITHMS (4)
COMPUTERS (4)
FILTERING THEORY (4)
MEL FREQUENCY CEPSTRAL COEFFICIENTS (4)
NOISE MEASUREMENT (4)
SIGNAL PROCESSING (4)
SPEAKER VERIFICATION (4)
ARTIFICIAL NEURAL NETWORK (3)
AUDIO SIGNAL PROCESSING (3)
BANDWIDTH (3)
CEPSTRAL COEFFICIENTS (3)
CLASSIFICATION (3)
CLUSTERING ALGORITHMS (3)
COMPUTATIONAL MODELING (3)
CONTINUOUS HIDDEN MARKOV MODEL (3)
COVARIANCE MATRIX (3)
ESTIMATION (3)
FREQUENCY DOMAIN ANALYSIS (3)
GAUSSIAN MIXTURE MODEL (3)
MATHEMATICAL MODEL (3)
NATURAL LANGUAGES (3)
NEURAL NETS (3)
NOISE ROBUSTNESS (3)
ROMANIAN LANGUAGE (3)
SIGNAL TO NOISE RATIO (3)
SPEAKER IDENTIFICATION (3)
SUPPORT VECTOR MACHINE CLASSIFICATION (3)
UNDERWATER SOUND (3)
VECTORS (3)
VITERBI ALGORITHM (3)
ATMOSPHERIC MODELING (2)
AUDITORY SYSTEM (2)
BENGALI LANGUAGE (2)
CONFERENCES (2)
CORRELATION (2)
DETECTORS (2)
DISCRETE COSINE TRANSFORMS (2)
DISCRETE FOURIER TRANSFORMS (2)
DISTORTION (2)
DISTRIBUTED SPEECH RECOGNITION (2)
ELMAN NETWORK (2)
EMOTION RECOGNITION (2)
ENGINES (2)
EQUATIONS (2)
FEATURE SELECTION (2)
FEATURE VECTORS (2)
FILTER BANK (2)
GMM (2)
HUMANS (2)
K MEANS ALGORITHM (2)
LABORATORIES (2)
LEARNING (ARTIFICIAL INTELLIGENCE) (2)
LINE SPECTRAL FREQUENCIES (2)
LINEAR PREDICTION COEFFICIENTS (2)
LPC (2)
MAXIMUM LIKELIHOOD DETECTION (2)
MECHANICAL ENGINEERING COMPUTING (2)
MEL FREQUENCY CEPSTRAL COEFFICIENT TECHNIQUE (2)
MEL FREQUENCY CEPSTRUM COEFFICIENT (2)
MULTILAYER PERCEPTRONS (2)
NOISY SPEECH RECOGNITION (2)
OCEANOGRAPHIC TECHNIQUES (2)
PATTERN CLASSIFICATION (2)
PATTERN MATCHING (2)
PATTERN RECOGNITION (2)
PERCEPTUAL LINEAR PREDICTION (2)
PLP (2)
POLYNOMIALS (2)
PROBABILITY (2)
ROBUST SPEECH RECOGNITION (2)