Search results for: Haizhou Li

Items from 1 to 10 out of 10 results

chapter

Adaptation of PLDA for multi-source text-independent speaker verification

Liping Chen, Kong Aik Lee, Bin Ma, Long Ma, et al.

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5380 - 5384

Probabilistic linear discriminant analysis (PLDA) is widely described as an effective model for text-independent speaker verification in the i-vector space. The PLDA scoring function is typically formulated as the likelihood ratio between the speaker-adapted and the universal PLDAs. In this case, the adaptation of PLDA was performed through the speaker factors. In this paper, we show that the channel...

chapter

I-vector based deep neural network acoustic model adaptation using multilingual language resource

Haihua Xu, Wei Rao, Xiong Xiao, Hao Huang, et al.

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 5

I-vector adaptation of DNN-HMM acoustic models has shown clear performance improvement for speech recognition. In this paper, we study this technique on the Babel task. We use Swahili as the target language (50 hours of training data) and six other languages as multilingual resources to train i-vector extractors, respectively. Our study shows that i-vector extractors trained with more multilingual data only...

chapter

Multi-channel feature adaptation for robust speech recognition

Zhaofeng Zhang, Xiong Xiao, Longbiao Wang, Jianwu Dang, et al.

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

In this paper, we propose a feature adaptation method that combines speech features from multiple microphone channels for robust automatic speech recognition (ASR). The proposed method first transforms the features in all channels using channel-dependent linear transforms, and then sums the channels into one channel for acoustic modeling. The transform parameters are estimated by maximizing the likelihood...

article

Feature Adaptation Using Linear Spectro-Temporal Transform for Robust Speech Recognition

Duc Hoang Ha Nguyen, Xiong Xiao, Eng Siong Chng, Haizhou Li

IEEE/ACM Transactions on Audio, Speech, and Language Processing > 2016 > 24 > 6 > 1006 - 1019

Spectral information represents short-term speech information within a frame of a few tens of milliseconds, while temporal information captures the evolution of speech statistics over consecutive frames. Motivated by the findings that human speech comprehension relies on the integrity of both the spectral content and the temporal envelope of the speech signal, we study a spectro-temporal transform framework...

chapter

Channel adaptation of PLDA for text-independent speaker verification

Liping Chen, Kong Aik Lee, Bin Ma, Wu Guo, et al.

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5251 - 5255

Probabilistic linear discriminant analysis (PLDA) has been shown to be effective for modeling channel variability in the i-vector space for text-independent speaker verification. Speaker verification is a binary hypothesis test. Given a test segment, the verification score can be computed as the log-likelihood ratio between a speaker-adapted PLDA and the universal PLDA model. This work proposes to...

chapter

Phonotactic spoken language recognition: Using diversely adapted acoustic models in parallel phone recognizers

Cheung-Chi Leung, Bin Ma, Haizhou Li

2012 8th International Symposium on Chinese Spoken Language Processing > 108 - 111

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

In phonotactic spoken language recognition systems, acoustic model adaptation prior to phone lattice decoding has been adopted to deal with the mismatch between training and test conditions. Moreover, combining diversified phonotactic features is commonly used. These motivate us to have an in-depth investigation of combining diversified phonotactic features from diversely adapted acoustic models....

chapter

An analysis of vector Taylor series model compensation for non-stationary noise in speech recognition

Duc Hoang Ha Nguyen, Xiong Xiao, Eng Siong Chng, Haizhou Li

2012 8th International Symposium on Chinese Spoken Language Processing > 131 - 135

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

In this paper, we investigate a feature conditioning method for VTS-based model compensation. VTS is a technique that predicts a noisy acoustic model from a clean acoustic model and a noise model. Most previous studies use a single Gaussian noise model, which is unable to model noise statistics well, especially in non-stationary noisy environments. In this paper, we propose...

chapter

Lasso environment model combination for robust speech recognition

Xiong Xiao, Jinyu Li, Eng Siong Chng, Haizhou Li

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4305 - 4308

In this paper, we propose a novel acoustic model adaptation method for noise robust speech recognition. Model combination is a common way to adapt acoustic models to a target test environment. For example, the mean supervectors of the adapted model are obtained as a linear combination of mean supervectors of many pre-trained environment-dependent acoustic models. Usually, the combination weights are...

chapter

Vulnerability of speaker verification systems against voice conversion spoofing attacks: The case of telephone speech

Tomi Kinnunen, Zhi-Zheng Wu, Kong Aik Lee, Filip Sedlak, et al.

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4401 - 4404

Voice conversion - the methodology of automatically converting one's utterances to sound as if spoken by another speaker - presents a threat for applications relying on speaker verification. We study the vulnerability of text-independent speaker verification systems against voice conversion attacks using telephone speech. We implemented a voice conversion system with two types of features and nonparallel...

chapter

Maximum likelihood adaptation of histogram equalization with constraint for robust speech recognition

Xiong Xiao, Jinyu Li, Eng Siong Chng, Haizhou Li

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5480 - 5483

In this paper, we propose a novel feature space adaptation technique to improve the robustness of speech recognition in noisy environments. Histogram equalization (HEQ) is an effective technique for improving robustness by reducing the difference between clean and noisy features. A weakness of HEQ is that it does not take into account the acoustic model, resulting in a possible mismatch between HEQ-processed...

Filter options

Keywords:
ADAPTATION MODELS

INFONA - science communication portal
