Search results

Items from 61 to 80 out of 269 results

chapter

Sparse and cross-term free time-frequency distribution based on Hermite functions

Branka Jokanovic, Moeness Amin

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 3696 - 3700

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Hermite functions are an effective tool for improving the resolution of the single-window spectrogram. In this paper, we analyze the Hermite functions in the ambiguity domain and show that the higher order terms can introduce undesirable cross-terms in the multiwindow spectrogram. The optimal number of Hermite functions depends on the location and spread of signal auto-terms in the ambiguity domain...

chapter

Unsupervised learning of acoustic features via deep canonical correlation analysis

Weiran Wang, Raman Arora, Karen Livescu, Jeff A. Bilmes

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4590 - 4594

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

It has been previously shown that, when both acoustic and articulatory training data are available, it is possible to improve phonetic recognition accuracy by learning acoustic features from this multi-view data with canonical correlation analysis (CCA). In contrast with previous work based on linear or kernel CCA, we use the recently proposed deep CCA, where the functional form of the feature mapping...

chapter

Custom-designed SVM kernels for improved robustness of phoneme classification

Jibran Yousafzai, Zoran Cvetkovic, Peter Sollich

2009 17th European Signal Processing Conference > 1765 - 1769

2009 17th European Signal Processing Conference

The robustness of phoneme classification to white Gaussian noise and pink noise in the acoustic waveform domain is investigated using support vector machines. We focus on the problem of designing kernels which are tuned to the physical properties of speech. For comparison, results are reported for the PLP representation of speech using standard kernels. We show that major improvements can be achieved...

chapter

SVM speaker verification using a new sequence Kernel

Jerome Louradour, Khalid Daoudi

2005 13th European Signal Processing Conference > 1 - 4

2005 13th European Signal Processing Conference

Using the framework of Reproducing Kernel Hilbert Spaces, we develop a new sequence kernel that measures similarity between sequences of observations. We then apply it to a text-independent speaker verification task using the NIST 2004 Speaker Recognition Evaluation database. The results show that incorporating our new sequence kernel in an SVM training architecture not only yields performance significantly...

chapter

Comparison of different strategies for a SVM-based audio segmentation

Mathieu Ramona, Gaël Richard

2009 17th European Signal Processing Conference > 20 - 24

2009 17th European Signal Processing Conference

We compare in this paper diverse hierarchical and multi-class approaches for the speech/music segmentation task, based on Support Vector Machines, combined with a median filter post-processing. We show the efficiency of kernel tuning through the novel Kernel Target Alignment criterion. Quantitative results provide an F-measure of 96.9%, which represents an error reduction of about 50% compared to the...

chapter

Nonlinear signal decomposition into functional series for speech recognition: A new approach

Alexander M. Krot, Polina P. Tkachova, Boris A. Goncharov

2000 10th European Signal Processing Conference > 1 - 4

2000 10th European Signal Processing Conference

A nonlinear speech signal decomposition based on Volterra-Wiener functional series is described. A solution to the phoneme recognition problem by means of measuring Wiener kernels is proposed.

chapter

Linear transformation on speech subspace for analysis of speech under stress condition

Bhanu Priya, S. Dandapat

2015 Twenty First National Conference on Communications (NCC) > 1 - 6

2015 Twenty First National Conference on Communications (NCC)

In this work, a novel approach of linear transformation on the speech subspace is used to preserve the properties of the speech signal under stress conditions. It is assumed that there exists another subspace, called the speech subspace, which contains the properties of the speech signal under neutral and stress conditions. Therefore, the speech component of stressed speech is determined by linear transformation...

chapter

Speech emotion recognition using RBF kernel of LIBSVM

Y. D. Chavhan, B. S. Yelure, K. N. Tayade

2015 2nd International Conference on Electronics and Communication Systems (ICECS) > 1132 - 1135

2015 2nd International Conference on Electronics and Communication Systems (ICECS)

Automatic Speech Emotion Recognition (SER) is a current research topic in the field of Human Computer Interaction (HCI) with a wide range of applications. Speech features such as Mel Frequency Cepstrum Coefficients (MFCC) and Mel Energy Spectrum Dynamic Coefficients (MEDC) are extracted from the speech utterance. LIBSVM is used as the classifier to identify different emotional states such as anger,...

chapter

FLOSS as a Source for Profanity and Insults: Collecting the Data

Megan Squire, Rebecca Gazda

2015 48th Hawaii International Conference on System Sciences > 5290 - 5298

2015 48th Hawaii International Conference on System Sciences (HICSS)

An important task in machine learning and natural language processing is to learn to recognize different types of human speech, including humor, sarcasm, insults, and profanity. In this paper we describe our method to produce test and training data sets to assist in this task. Our test data sets are taken from the domain of free, libre, and open source software (FLOSS) development communities. We...

chapter

Ensemble Nyström method for predicting conflict level from speech

Dong-Yan Huang, Haizhou Li, Minghui Dong

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 5

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

The Nyström method is an efficient technique for scaling kernel learning to very large data sets of millions of samples or more. Instead of computing the full kernel matrix, it approximates a kernel learning problem with a linear prediction problem. We propose an ensemble Nyström method for high-dimensional prediction of conflict level from speech. The experiments have been conducted over the SSPNet Conflict Corpus,...

chapter

Emotion Detection through Speech and Facial Expressions

Krishna Mohan Kudiri, Abas Md. Said, M. Yunus Nayan

2014 International Conference on Computer Assisted System in Health > 26 - 31

2014 International Conference on Computer Assisted System in Health (CASH)

Human-machine interaction is one of the fastest-growing areas of research in the field of information technology. To date, a majority of research in this field has been conducted using unimodal and multimodal systems with asynchronous data. As a result, improper synchronization has become a common problem; because of it, the system complexity increases and the system response time...

chapter

Safe Transmission of Text Files through a New Audio Steganography Technique

Mayank Punetha, Ravi Kumar, Mahua Bhattacharya, Neelam Jain, et al.

2014 2nd International Symposium on Computational and Business Intelligence > 58 - 62

2014 2nd International Symposium on Computational and Business Intelligence (ISCBI)

Steganography is the practice of hiding information so that data remains safe and untouched by eavesdroppers. In this paper we demonstrate a way to transmit data from sender to receiver without it being handled by Eve, through a new steganography technique. We use an audio file to hide our data, as changes made to audio are not easily noticed. Audio files in wav form are represented...

chapter

Classification of patient's reaction in language assessment during awake craniotomy

Toshihiko Nishimura, Tomoharu Nagao, Hiroshi Iseki, Yoshihiro Muragaki, et al.

2014 IEEE 7th International Workshop on Computational Intelligence and Applications (IWCIA) > 207 - 212

2014 IEEE 7th International Workshop on Computational Intelligence and Applications (IWCIA)

Surgical video recording is widely used in operating rooms for purposes such as analyzing surgical procedures and detecting intraoperative incidents. As a result, a number of useful surgical video records are stored in hospitals. These video records are considered to contain significant information, so there is a need to make use of this video data. In awake craniotomy, which is one of the advanced neurological...

chapter

A Case Study on Back-End Voice Activity Detection for Distributed Specch Recognition System Using Support Vector Machines

Azzedine Touazi, Mohamed Debyeche

2014 Tenth International Conference on Signal-Image Technology and Internet-Based Systems > 21 - 26

2014 Tenth International Conference on Signal-Image Technology & Internet-Based Systems (SITIS)

Recently, Voice Activity Detection (VAD) algorithms based on machine learning techniques have shown impressive results in the area of speech recognition. In this paper, we present a case study and discuss the performance of VAD based on Support Vector Machines (SVM) for a Distributed Speech Recognition (DSR) system. In this case study, the speech and the non-speech frames are detected from the...

chapter

Multitask speaker profiling for estimating age, height, weight and smoking habits from spontaneous telephone speech signals

Amir Hossein Poorjam, Mohamad Hasan Bahari, Hugo Van hamme

2014 4th International Conference on Computer and Knowledge Engineering (ICCKE) > 7 - 12

2014 4th International eConference on Computer and Knowledge Engineering (ICCKE)

This paper proposes a novel approach for automatic estimation of four important traits of speakers, namely age, height, weight and smoking habit, from speech signals. In this method, each utterance is modeled using the i-vector framework which is based on the factor analysis on Gaussian Mixture Model (GMM) mean supervectors, and the Non-negative Factor Analysis (NFA) framework which is based on a...

chapter

A speaker recognition algorithm based on factor analysis

Xuanjing Shen, Yujie Zhai, Yu Wang, Haipeng Chen

2014 7th International Congress on Image and Signal Processing > 897 - 901

2014 7th International Congress on Image and Signal Processing (CISP)

Channel interference affecting the identification result is prevalent among existing speaker recognition algorithms. In order to improve the accuracy of the algorithm, the paper utilizes the technique of latent factor analysis (LFA) to deal with the channel factors in the speaker's Gaussian Mixture Model (GMM). In the endpoint detection phase of speaker recognition, the algorithm introduces the...

chapter

Kernel ridge regression method applied to speech recognition problem: A novel approach

Hoang Trang, Loc Tran

2014 International Conference on Advanced Technologies for Communications (ATC 2014) > 172 - 174

2014 International Conference on Advanced Technologies for Communications (ATC)

Speech recognition is an important problem in the pattern recognition research field. In this paper, the kernel ridge regression method is proposed to be applied to the MFCC feature vectors of the speech dataset available from IC Design lab at Faculty of Electricals-Electronics Engineering, University of Technology, Ho Chi Minh City. Experimental results show that the kernel ridge regression method outperforms...

chapter

Speaker Adaptation Using Nonlinear Regression Techniques for HMM-Based Speech Synthesis

Doo Hwa Hong, Shin Jae Kang, Joun Yeop Lee, Nam Soo Kim

2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing > 586 - 589

2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP)

The maximum likelihood linear regression (MLLR) technique is a well-known approach to parameter adaptation in hidden Markov model (HMM)-based systems. In this paper, we propose the maximum penalized likelihood kernel regression (MPLKR) approach as a novel adaptation technique for HMM-based speech synthesis. The proposed algorithm performs a nonlinear regression between the mean vector of the base...

chapter

eBear: An expressive Bear-Like robot

Xiao Zhang, Ali Mollahosseini, Amir H. Kargar B., Evan Boucher, et al.

The 23rd IEEE International Symposium on Robot and Human Interactive Communication > 969 - 974

2014 RO-MAN: The 23rd IEEE International Symposium on Robot and Human Interactive Communication

This paper presents an anthropomorphic robotic bear for the exploration of human-robot interaction including verbal and non-verbal communications. This robot is implemented with a hybrid face composed of a mechanical faceplate with 10 DOFs and an LCD-display-equipped mouth. The facial emotions of the bear are designed based on the description of the Facial Action Coding System as well as some animal-like...

chapter

An integrated spoken language recognition system using support vector machines

Garima Vyas, Malay Kishore Dutta

2014 Seventh International Conference on Contemporary Computing (IC3) > 105 - 108

2014 Seventh International Conference on Contemporary Computing (IC3)

An automatic Language Identification (LID) system is designed to recognize a language from a given spoken utterance. The spoken utterances are classified according to a pre-defined set of languages. In this paper an LID system is designed for two different languages, namely English and French. The classification of an audio sample is done by extracting Mel frequency cepstral coefficients (MFCCs)...

Keywords:
KERNEL
SPEECH

Content availability

Available (268)
None (1)

Keywords

SUPPORT VECTOR MACHINES (134)
SPEECH RECOGNITION (91)
TRAINING (88)
FEATURE EXTRACTION (84)
SPEAKER RECOGNITION (52)
MEL FREQUENCY CEPSTRAL COEFFICIENT (38)
DATA MINING (34)
SPEECH PROCESSING (34)
HIDDEN MARKOV MODELS (33)
SUPPORT VECTOR MACHINE (32)
ACCURACY (29)
EMOTION RECOGNITION (23)
GAUSSIAN PROCESSES (23)
NOISE (23)
VECTORS (23)
ACOUSTICS (22)
SVM (22)
PRINCIPAL COMPONENT ANALYSIS (21)
NIST (19)
DATABASES (18)
LEARNING (ARTIFICIAL INTELLIGENCE) (18)
ROBUSTNESS (17)
PATTERN CLASSIFICATION (16)
GAUSSIAN MIXTURE MODEL (15)
ESTIMATION (14)
SPEAKER VERIFICATION (13)
ADAPTATION MODEL (12)
ARTIFICIAL NEURAL NETWORKS (12)
SUPPORT VECTOR MACHINE CLASSIFICATION (12)
NATURAL LANGUAGE PROCESSING (11)
COMPUTATIONAL MODELING (10)
MACHINE LEARNING (10)
POLYNOMIALS (10)
SIGNAL CLASSIFICATION (10)
SPEAKER IDENTIFICATION (10)
SPEECH ENHANCEMENT (10)
CLASSIFICATION ALGORITHMS (9)
ENCODING (9)
MATHEMATICAL MODEL (9)
MFCC (9)
CLASSIFICATION (8)
REVERBERATION (8)
SIGNAL PROCESSING (8)
SIGNAL TO NOISE RATIO (8)
SPEECH EMOTION RECOGNITION (8)
TESTING (8)
TIME-FREQUENCY ANALYSIS (8)
CEPSTRAL ANALYSIS (7)
COVARIANCE MATRIX (7)
ENTROPY (7)
EQUATIONS (7)
ERROR ANALYSIS (7)
GAUSSIAN MIXTURE MODELS (7)
KERNEL FUNCTION (7)
NOISE MEASUREMENT (7)
RADIAL BASIS FUNCTION NETWORKS (7)
REPRODUCING KERNEL HILBERT SPACE (7)
SPECTROGRAM (7)
TRAINING DATA (7)
ADAPTIVE FILTERS (6)
MICROPHONES (6)
MULTIPLE KERNEL LEARNING (6)
REGRESSION ANALYSIS (6)
SPEECH SIGNAL (6)
ACOUSTIC SIGNAL PROCESSING (5)
AUDIO SIGNAL PROCESSING (5)
CONFERENCES (5)
CORRELATION (5)
DATA MODELS (5)
DELAY (5)
FUZZY SET THEORY (5)
HILBERT SPACES (5)
I-VECTOR (5)
NONLINEAR FILTERS (5)
OPTIMIZATION (5)
PROBABILITY DENSITY FUNCTION (5)
SPEECH SYNTHESIS (5)
STATISTICAL ANALYSIS (5)
VECTOR QUANTIZATION (5)
ADAPTIVE VOLTERRA FILTER (4)
APPROXIMATION METHODS (4)
BLIND SOURCE SEPARATION (4)
CEPSTRUM (4)
CLUSTERING ALGORITHMS (4)
DECODING (4)
DICTIONARIES (4)
ECHO HIDING (4)
ECHO SUPPRESSION (4)
FEATURE SELECTION (4)
GAUSSIAN KERNEL (4)
GMM (4)
LATTICES (4)
MEL FREQUENCY CEPSTRAL COEFFICIENTS (4)
NATURAL LANGUAGES (4)
NOISE REDUCTION (4)
NONLINEAR MAPPING (4)
PARTICLE SWARM OPTIMIZATION (4)
PATHOLOGY (4)

INFONA - science communication portal
