Search results

Items from 1 to 20 out of 654 results

chapter

Classification of multi speaker shouted speech and single speaker normal speech

Shikha Baghel, S. R. Mahadeva Prasanna, Prithwijit Guha

TENCON 2017 - 2017 IEEE Region 10 Conference > 2388 - 2392

This work proposes a method for classifying shouted multi-speaker speech versus normal single-speaker speech, which is the most frequently occurring scenario in news debates. In this work, the multi-speaker shouted and single-speaker normal speech classes are referred to as shouted and normal speech, respectively. Spectral features and source features are explored for the classification task. The...

chapter

A Deep Transfer Learning Approach for Improved Post-Traumatic Stress Disorder Diagnosis

Debrup Banerjee, Kazi Islam, Gang Mei, Lemin Xiao, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 11 - 20

Post-traumatic stress disorder (PTSD) is a traumatic-stressor related disorder developed by exposure to a traumatic or adverse environmental event that caused serious harm or injury. The structured interview is the only widely accepted clinical practice for PTSD diagnosis but suffers from several limitations, including the stigma associated with the disease. Diagnosis of PTSD patients by analyzing speech...

chapter

Detecting depression in speech: Comparison and combination between different speech types

Hailiang Long, Zhenghao Guo, Xia Wu, Bin Hu, et al.

2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) > 1052 - 1058

Depression is a mental disorder of high prevalence, leading to a negative effect on individuals, their families, society and the economy. In recent years, the problem of automatic detection of depression from the speech signal has gained more interest. In this paper, a new multiple classifier system for depression recognition was developed and tested. The novel aspect of this methodology is the combination...

chapter

An improved approach to open set text-independent speaker identification (OSTI-SI)

ShrutiSarika Chakraborty, Ranjan Parekh

2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN) > 51 - 56

This paper focuses on open-set text-independent speaker identification, which is one of the most challenging subclasses of speaker recognition. The initial stage is similar to closed-set speaker identification, where the distortion of each test voice against all training voices is determined. The distortions after normalization are set as the decision criterion, which eases the process of thresholding. The threshold...

chapter

Unsupervised speaker segmentation framework based on sparse correlation feature

Yi Xin Sun, Yong Ma, Kai Bo Shi, Jiang Ping Hu, et al.

2017 Chinese Automation Congress (CAC) > 3058 - 3063

With increasing stress at work and in study, mental health has become a major problem in current social research. Generally, researchers can analyze psychological health states by using social perception behavior. The speech signal is an important research direction in this domain. It objectively assesses the mental health of social groups through the extraction and fusion of speech features...

chapter

Comparing statistical classifiers for emotion classification

Raseeda Hamzah, Nursuriati Jamil, Khyrina Airin Fariza Abu Samah, Nur Nabilah Abu Mangshor, et al.

2017 7th IEEE International Conference on System Engineering and Technology (ICSET) > 183 - 188

Speech emotion recognition has been widely used in human-computer interaction and applications. This paper classifies emotion into two classes: happy and angry. All speech signals are preprocessed from a Malay spoken speech database. Emotional information is obtained by applying two well-established acoustic features, namely Mel Frequency Cepstral Coefficients (MFCC) and Short Time Energy (STE)...

chapter

Implementation of accent recognition methods subsystem for eLearning systems

Eugen Tverdokhleb, Hennadii Dobrovolskyi, Nataliya Keberle, Natalia Myronova

2017 9th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS) > 2 > 1037 - 1041

The results of implementing an external accent recognition system and integrating it into the massive open online course platform Moodle are reported. Accent recognition becomes important in foreign language learning to give a student feedback on the presence of a certain unwanted accent in foreign language pronunciation. Implementation of several accent recognition methods and their...

chapter

Improvement of speech recognition results by a combination of systems

Rama Hasan, Hussein Hussein, Pavlos Lazaridis, Sinan Khwandah, et al.

2017 23rd International Conference on Automation and Computing (ICAC) > 1 - 4

The aim of this study is to suggest an algorithm that combines two speech recognition systems. These systems differ in the methods used in the feature extraction stage, but they share the same classifier, a Hidden Markov Model (HMM). The first system uses Mel-Frequency Cepstrum Coefficients (MFCC), the second one uses Linear Prediction Cepstrum Coefficients (LPCC), and the third system uses Perceptual...

chapter

Speaker-Dependent Isolated-Word Speech Recognition System Based on Vector Quantization

Yinyin Zhao, Lei Zhu

2017 International Conference on Computer Network, Electronic and Automation (ICCNEA) > 133 - 137

A speaker-dependent speech recognition system must recognize not only the speech but also the speaker of the segment. In this paper, through the study of speaker-dependent isolated-word speech characteristics, two indicators (short-time average zero-crossing rate and dual-threshold endpoint) are selected to test the signal endpoint, and MFCC parameters are taken...

chapter

Development of speech emotion recognition system using deep belief networks in malayalam language

Athira Chandran, D. Pravena, D. Govind

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 676 - 680

The goal of this work is to validate the impact of natural elicitation of emotions by the speakers during the development of speech emotion databases for the Malayalam language. The work also proposes a Gaussian Mixture Model-Deep Belief Networks (GMM-DBN) based speech emotion recognition system. To test the effect of emotion elicitation by the speakers, two independent datasets with emotionally biased...

chapter

Hybrid DWT and MFCC feature warping for noisy forensic speaker verification in room reverberation

Ahmed Kamil Hasan Al-Ali, Bouchra Senadji, Vinod Chandran

2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA) > 434 - 439

The robustness of speaker verification systems is often degraded in real forensic applications, which contain environmental noise and reverberation. Reverberation results in mismatched conditions between enrolment and test speech signals. In this work, we investigate the effectiveness of combining features of discrete wavelet transform (DWT) and feature-warped mel frequency cepstral coefficients (MFCCs)...

chapter

GMM based automatic speaker verification system development for forensics in Bahasa Indonesia

Ivan Stefanus, R.S. Joko Sarwono, Miranti Indar Mandasari

2017 5th International Conference on Instrumentation, Control, and Automation (ICA) > 56 - 61

Speaker verification based on phonetic-acoustic approach and text-dependent framework has been applied for forensic purposes in Indonesian court since 2008. In order to accelerate the speaker verification process, an automatic text-independent system is developed. This automatic system employs MFCC features and GMM speaker modeling, a standard and simple approach used in automatic speaker recognition...

chapter

Identification of correlation between blood relations using speech signal

Palli Padmini, Shikha Tripathi, Kaustav Bhowmick

2017 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES) > 1 - 6

This paper presents a study of how speech features have comparable parameters amongst blood relations. Mel Frequency Cepstral Coefficients (MFCC) are used to extract features from the input speech signal, and vector quantization through a modified k-means LBG (Linde, Buzo, and Gray) algorithm is implemented to analyse and estimate the similarity to perform related studies. The study is...

chapter

Novel energy separation based instantaneous frequency features for spoof speech detection

Madhu R. Kamble, Hemant A. Patil

2017 25th European Signal Processing Conference (EUSIPCO) > 106 - 110

Speech Synthesis (SS) and Voice Conversion (VC) present a genuine risk of attacks on Automatic Speaker Verification (ASV) technology. In this paper, we evaluate a front-end anti-spoofing technique to protect ASV systems against SS and VC attacks using a standard benchmarking database. In particular, we propose a novel feature set, namely, Energy Separation Algorithm-based Instantaneous Frequency Cosine...

chapter

Speaker verification anti-spoofing using linear prediction residual phase features

Cemal Hanilci

2017 25th European Signal Processing Conference (EUSIPCO) > 96 - 100

The vulnerability of automatic speaker verification (ASV) systems to spoofing attacks is an important security concern for the reliability of ASV technology. Recently, various countermeasures have been developed for spoofing detection. In this paper, we propose to use features derived from the linear prediction (LP) residual signal for spoofing detection using a simple Gaussian mixture model (GMM)...

chapter

Variants of mel-frequency cepstral coefficients for improved whispered speech speaker verification in mismatched conditions

Milton Sarria-Paja, Tiago H. Falk

2017 25th European Signal Processing Conference (EUSIPCO) > 91 - 95

In this paper, automatic speaker verification using normal and whispered speech is explored. Typically, for speaker verification systems, varying vocal effort inputs during the testing stage significantly degrade system performance. Solutions such as feature mapping or addition of multi-style data during training and enrollment stages have been proposed but do not show similar advantages for the...

chapter

Glottal mixture model (GLOMM) for speaker identification on telephone channels

Paul M. Baggenstoss, Kevin Wilkinghoff, Frank Kurth

2017 25th European Signal Processing Conference (EUSIPCO) > 2734 - 2738

The Glottal Mixture Model (GLOMM) extracts speaker-dependent voice source information from speech data. It has previously been shown to provide speaker identification performance on clean speech comparable to the universal background model (UBM), a state-of-the-art method based on MFCC. When combined with UBM, the error rate was reduced by a factor of three, showing that the voice source information...

chapter

Blind Source Separation and Identification for Speech Signals

Jie Yin, Zhiliang Liu, Yaqiang Jin, Dandan Peng, et al.

2017 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC) > 398 - 402

Background noise reduction has been studied for many years. However, suppression of unwanted human speech noise is not well discussed due to the sparsity of the speech signal. Traditional blind source separation (BSS) methods such as independent component analysis (ICA) assume prior knowledge of the number of sources and require that the number of sources equal the number of sensors. The above limitations...

chapter

Speaker recognition based on MFCC and BP neural networks

Yi Wang, Bob Lawlor

2017 28th Irish Signals and Systems Conference (ISSC) > 1 - 4

Speaker recognition has been developed over many years and comes with many different methods. MFCC is one of the more successful methods because it is generally modeled on the human auditory system. It achieves a high recognition success rate and strong robustness against noise in the lower frequency regions. However, in the higher frequency regions, it captures speaker characteristics information...

chapter

Significance of teo slope feature in speech emotion recognition

P S Drisya, Rajeev Rajan

2017 International Conference on Networks & Advances in Computational Technologies (NetACT) > 438 - 441

The growth in human-computer interaction has necessitated accurate recognition of emotion from speech data. This paper presents a novel feature called TEO (Teager Energy Operator) Slope for emotion recognition. The feature is obtained by applying a least-squares fit instead of the DCT to the TEO feature. The feature was tested on the publicly available Berlin Emotion Database...

Data set: ieee
Keywords: FEATURE EXTRACTION, MEL FREQUENCY CEPSTRAL COEFFICIENT, SPEECH
Publication type: book

Content availability

Available (651)
None (3)

Keywords

SPEECH RECOGNITION (353)
TRAINING (149)
HIDDEN MARKOV MODELS (147)
SPEAKER RECOGNITION (147)
MFCC (143)
DATABASES (117)
SPEECH PROCESSING (103)
SUPPORT VECTOR MACHINES (92)
ACCURACY (90)
CEPSTRAL ANALYSIS (76)
NOISE (70)
EMOTION RECOGNITION (69)
FILTER BANKS (50)
SPEAKER IDENTIFICATION (44)
GMM (42)
ROBUSTNESS (42)
GAUSSIAN MIXTURE MODEL (39)
NOISE MEASUREMENT (37)
GAUSSIAN PROCESSES (34)
MATHEMATICAL MODEL (34)
VECTORS (33)
CLASSIFICATION ALGORITHMS (32)
ARTIFICIAL NEURAL NETWORKS (31)
DATA MINING (31)
SPEAKER VERIFICATION (31)
MEL FREQUENCY CEPSTRAL COEFFICIENTS (30)
CORRELATION (28)
TESTING (27)
AUTOMATIC SPEECH RECOGNITION (26)
VECTOR QUANTIZATION (26)
MEL-FREQUENCY CEPSTRAL COEFFICIENTS (24)
SIGNAL TO NOISE RATIO (24)
SVM (24)
COMPUTATIONAL MODELING (23)
DISCRETE COSINE TRANSFORMS (23)
FILTER BANK (23)
AUDIO SIGNAL PROCESSING (22)
HIDDEN MARKOV MODEL (21)
KERNEL (20)
PRINCIPAL COMPONENT ANALYSIS (20)
SIGNAL CLASSIFICATION (20)
SUPPORT VECTOR MACHINE (20)
NATURAL LANGUAGE PROCESSING (18)
FILTERING THEORY (17)
SIGNAL PROCESSING (17)
HMM (16)
MEL-FREQUENCY CEPSTRAL COEFFICIENT (16)
MUSIC (16)
ACOUSTIC SIGNAL PROCESSING (15)
LPC (15)
NEURAL NETWORKS (15)
NIST (15)
COMPUTERS (14)
SUPPORT VECTOR MACHINE CLASSIFICATION (14)
ADAPTATION MODELS (13)
MEL FREQUENCY CEPSTRAL COEFFICIENTS (MFCC) (13)
MICROPHONES (13)
NEURAL NETWORK (13)
SPEECH CODING (13)
SPEECH EMOTION RECOGNITION (13)
SPEECH ENHANCEMENT (13)
TIME FREQUENCY ANALYSIS (13)
TRANSFORMS (13)
ALGORITHM DESIGN AND ANALYSIS (12)
DATA MODELS (12)
DISCRETE WAVELET TRANSFORMS (12)
FEATURE SELECTION (12)
GAUSSIAN MIXTURE MODELS (12)
HARMONIC ANALYSIS (12)
INDEXES (12)
LEARNING (ARTIFICIAL INTELLIGENCE) (12)
PATTERN CLASSIFICATION (12)
VECTOR QUANTISATION (12)
WAVELET TRANSFORMS (12)
ACOUSTICS (11)
CEPSTRUM (11)
CLASSIFICATION (11)
CONFERENCES (11)
NEURAL NETS (11)
ROBUST SPEECH RECOGNITION (11)
SPEAKER DIARIZATION (11)
MACHINE LEARNING (10)
PITCH (10)
SPECTRAL ANALYSIS (10)
ACOUSTIC FEATURES (9)
AUDIO CLASSIFICATION (9)
EQUATIONS (9)
ESTIMATION (9)
HEURISTIC ALGORITHMS (9)
NEURONS (9)
POLYNOMIALS (9)
SPEECH ANALYSIS (9)
SPEECH FEATURE EXTRACTION (9)
TRAINING DATA (9)
VISUALIZATION (9)
VQ (9)
ADAPTATION MODEL (8)
