Search results for: Ting Huang

Items from 1 to 7 out of 7 results

chapter

An analysis of convolutional neural networks for speech recognition

Jui-Ting Huang, Jinyu Li, Yifan Gong

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4989 - 4993

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Despite the fact that several sites have reported the effectiveness of convolutional neural networks (CNNs) on some tasks, there is no deep analysis regarding why CNNs perform well and in which case we should see CNNs' advantage. In the light of this, this paper aims to provide some detailed analysis of CNNs. By visualizing the localized filters learned in the convolutional layer, we show that edge...

chapter

Recognition of multilingual speech in mobile applications

Hui Lin, Jui-ting Huang, Francoise Beaufays, Brian Strope, et al.

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4881 - 4884

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

We evaluate different architectures to recognize multilingual speech for real-time mobile applications. In particular, we show that combining the results of several recognizers greatly outperforms other solutions such as training a single large multilingual system or using an explicit language identification system to select the appropriate recognizer. Experiments are conducted on a trilingual English-French-Mandarin...

chapter

Assistive Technology for the Struggling Learner: Chinese PCS Editing Processor

Chi Nung Chu, Yu Ting Huang

2011 Fifth International Conference on Genetic and Evolutionary Computing > 150 - 152

2011 Fifth International Conference on Genetic and Evolutionary Computing (ICGEC)

There are learning and emotional difficulties for the children with communication disorders which involve a wide variety of problems in speech, language, and hearing. This paper aims at developing a Chinese PCS Editing Processor with Picture Communication Symbols (PCS), Chinese Text-to-Speech Engine and recording engine to improve the social interactivity and learning environment for the children...

chapter

Learning Virtual HD Model for Bi-model Emotional Speaker Recognition

Ting Huang, Yingchun Yang

2010 20th International Conference on Pattern Recognition > 1614 - 1617

2010 20th International Conference on Pattern Recognition (ICPR 2010)

Pitch mismatch between training and testing is one of the important factors causing the performance degradation of the speaker recognition system. In this paper, we adopted the missing feature theory and specified the Unreliable Region (UR) as the parts of the utterance with high emotion induced pitch variation. To model these regions, a virtual HD (High Different from neutral, with large pitch offset)...

chapter

Discriminative training methods for language models using conditional entropy criteria

Jui-Ting Huang, Xiao Li, Alex Acero

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5182 - 5185

2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2010)

This paper addresses the problem of discriminative training of language models that does not require any transcribed acoustic data. We propose to minimize the conditional entropy of word sequences given phone sequences, and present two settings in which this criterion can be applied. In an inductive learning setting, the phonetic/acoustic confusability information is given by a general phone error...

chapter

Kernel metric learning for phonetic classification

Jui-Ting Huang, Xi Zhou, M. Hasegawa-Johnson, T. Huang

2009 IEEE Workshop on Automatic Speech Recognition & Understanding > 141 - 145

2009 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU 2009)

While a sound spoken is described by a handful of frame-level spectral vectors, not all frames have equal contribution for either human perception or machine classification. In this paper, we introduce a novel framework to automatically emphasize important speech frames relevant to phonetic information. We jointly learn the importance of speech frames by a distance metric across the phone classes,...

chapter

Pitch envelope based frame level score reweighed algorithm for emotion robust speaker recognition

Dongdong Li, Yingchun Yang, Ting Huang

2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops > 1 - 4

2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops (ACII 2009)

Speech with various emotions aggravates the performance of speaker recognition systems. In this paper, a novel score normalization approach called pitch envelope based frame level score reweighted (PFLSR) algorithm is introduced to compensate the influence of the affective speech on speaker recognition. The approach assumes that the maximum likelihood model is not easily changed with the expressive...

Filter options

Keywords:
SPEECH

Keywords

SPEECH RECOGNITION (5)
TRAINING (5)
HIDDEN MARKOV MODELS (3)
ACCURACY (2)
EMOTION RECOGNITION (2)
MATHEMATICAL MODEL (2)
MEL FREQUENCY CEPSTRAL COEFFICIENT (2)
SPEAKER RECOGNITION (2)
SPEECH PROCESSING (2)
ACOUSTIC CONFUSABILITY INFORMATION (1)
ACOUSTIC MODELING (1)
ACOUSTIC SIGNAL PROCESSING (1)
ACOUSTICS (1)
ADAPTATION MODEL (1)
ASSISTIVE TECHNOLOGY (1)
AUDITORY SYSTEM (1)
BIMODEL EMOTIONAL SPEAKER RECOGNITION (1)
CHINESE PCS EDITING PROCESSOR (1)
CONDITIONAL ENTROPY (1)
CONDITIONAL ENTROPY CRITERIA (1)
CONVOLUTION (1)
CONVOLUTIONAL NEURAL NETWORKS (1)
DATA MINING (1)
DATABASES (1)
DISCRIMINATIVE TRAINING (1)
DISCRIMINATIVE TRAINING METHOD (1)
DNN (1)
EDUCATION (1)
EMOTION ROBUST SPEAKER RECOGNITION (1)
EMOTIONAL SPEAKER RECOGNITION (1)
ENGINES (1)
ENTROPY (1)
EQUATIONS (1)
FEATURE EXTRACTION (1)
FIRST PASS DECODING EXPERIMENT (1)
FRAME LEVEL SCORE REWEIGHED ALGORITHM (1)
GENERAL PHONE ERROR MODEL (1)
GMM-UBM (1)
HIGH DEFINITION VIDEO (1)
HIGH EMOTION INDUCED PITCH VARIATION (1)
INDUCTIVE LEARNING (1)
KERNEL (1)
KERNEL METRIC LEARNING (1)
LANGUAGE MODEL (1)
LEARNING BY EXAMPLE (1)
LOW FOOTPRINT MODELS (1)
MANDARIN AFFECTIVE SPEECH CORPUS (1)
MASC (1)
MAXIMUM LIKELIHOOD MODEL (1)
MAXOUT UNITS (1)
MISSING FEATURE THEORY (1)
MOBILE COMMUNICATION (1)
MULTILINGUAL SPEECH RECOGNITION (1)
NEURAL NETWORKS (1)
PERFORMANCE DEGRADATION (1)
PHONE SEQUENCE (1)
PHONETIC CLASSIFICATION (1)
PHONETIC CONFUSABILITY INFORMATION (1)
PICTURE COMMUNICATION SYMBOLS (1)
PITCH ENVELOPE (1)
PITCH MISMATCH (1)
PITCH TRANSFORMATION ALGORITHM (1)
POLYNOMIAL TRANSFORMATION FUNCTION (1)
POLYNOMIALS (1)
PROBABILITY DENSITY FUNCTION (1)
SPEECH FRAMES EMPHASIS (1)
SPEECH RECOGNITION FRAMEWORK (1)
SPEECH RECOGNIZER (1)
STATISTICAL ANALYSIS (1)
STATISTICAL MODELS (1)
STRONTIUM (1)
SUPPORT VECTOR MACHINES (1)
TEST SET ACOUSTICS (1)
TRAINING DATA (1)
UNSUPERVISED TRAINING (1)
VIRTUAL HD MODEL (1)
VIRTUAL HD MODEL LEARNING (1)
WORD SEQUENCE (1)
WRITING (1)

INFONA - science communication portal
