Search results for: Anil Kumar

Items from 1 to 7 out of 7 results

chapter

Class specific GMM based sparse feature for speech units classification

Pulkit Sharma, Vinayak Abrol, A. D. Dileep, Anil Kumar Sao

2017 25th European Signal Processing Conference (EUSIPCO) > 528 - 532

In this paper, features based on the sparse representation (SR) are proposed for the classification of speech units. The proposed method employs multiple dictionaries to effectively model variations present in the speech signal. Here, a Gaussian mixture model (GMM) is built using spectral features corresponding to frames of all the examples of a speech class. Multiple dictionaries corresponding to...

chapter

Text dependent voice recognition system using MFCC and VQ for security applications

Ashwin Nair Anil Kumar, Senthil Arumugam Muthukumaraswamy

2017 International conference of Electronics, Communication and Aerospace Technology (ICECA) > 2 > 130 - 136

This paper presents the implementation of a practical voice recognition system using MATLAB (R2014b) to secure a given user's system so that only the user may access it. Voice recognition systems have two phases, training and testing. During the training phase, the characteristic features of the speaker are extracted from the speech signal and stored in a database. In the testing phase, the stored...

chapter

Group delay functions for speaker diarization

Mohit Yadav, Anil Kumar Sao, A D Dileep, Padmanabhan Rajan

2016 Twenty Second National Conference on Communication (NCC) > 1 - 5

Speaker diarization is the task of determining “who spoke when” in a speech recording of an unknown duration containing an unknown number of speakers. The very unsupervised nature of this task makes it more challenging and demands that the feature representation used be highly discriminative across speakers. Commonly used features based on the short time Fourier transform are usually derived from...

chapter

Detection of emotionally significant regions of speech for emotion recognition

Hari Krishna Vydana, Peddakota Vikash, Tallam Vamsi, Kolla Pavan Kumar, et al.

2015 Annual IEEE India Conference (INDICON) > 1 - 6

Emotions in human speech are short-lived. In an emotive utterance, the emotive gestures produced by the speaker's emotive state persist only for a short duration. In this study, the regions of an utterance that are highly influenced by the emotive state of the speaker are detected. These regions are labeled as emotionally significant regions. Data from the detected emotionally significant...

chapter

Analysis of constraints on segmental DTW for the task of query-by-example spoken term detection

Sri Harsha Dumpala, K N R K Raju Alluri, Suryakanth V. Gangashetty, Anil Kumar Vuppala

2015 Annual IEEE India Conference (INDICON) > 1 - 6

Query-by-example spoken term detection (QbE-STD) refers to the task of determining the subsequence of a reference which matches with a query, where both the query and the reference are in audio format. Dynamic time warping (DTW) based techniques are explored to match the two sequences with different lengths in an unsupervised manner. In this paper, a completely unsupervised approach based on Segmental...

chapter

Various front end tools for digital speech processing

Shiva Prasad, Anil Kumar, Manjunatha, Kodanda Ramaiah

2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom) > 905 - 911

Speech is an informative signal that conveys many kinds of information, such as the status of the speaker and the speaker's environmental conditions, along with other necessary parameters that are classified as prosodic features and general features of speech. Speech is a signal that can be analysed and inspected against various criteria by applying several available techniques. In this...

chapter

IITKGP-MLILSC speech database for language identification

Sudhamay Maity, Anil Kumar Vuppala, K. Sreenivasa Rao, Dipanjan Nandi

2012 National Conference on Communications (NCC) > 1 - 5

In this paper, we introduce a speech database consisting of 27 Indian languages for analyzing language-specific information present in speech. In the context of Indian languages, a systematic analysis of various speech features and classification models for automatic language identification has not been performed because of the lack of a proper speech corpus covering the majority of the Indian languages...

Filter options

Keywords:
MEL FREQUENCY CEPSTRAL COEFFICIENT
Publication type:
book
