Search results for: Bo Xu

Items from 1 to 6 out of 6 results

chapter

An investigation of summed-channel speaker recognition with multi-session enrollment

Shanshan Zhang, Ce Zhang, Rong Zheng, Bo Xu

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 1640 - 1644

ICASSP 2014 - 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

This paper describes a general framework for speaker recognition under the summed-channel condition for both enrollment and test data. We present several methods for clustering the target speaker who appears in multiple summed-channel enrollment excerpts. In our approach, each excerpt is first segmented separately by a speaker diarization system. Then segments belonging to the same speaker...

chapter

Exploring nuisance attribute projection and score normalization for GLDS-SVM based automatic mispronunciation detection method

HongYan Li, Shen Huang, ShiJin Wang, JiaEn Liang, et al.

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5668 - 5671

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In mispronunciation detection, cross-speaker degradation and other confusing nuisance factors are challenging problems that demand a prompt solution. In this paper, we attempt to remove the non-pronunciation variations in the GLDS-SVM expansion space using a nuisance attribute projection strategy, in order to increase the separation between different phoneme instances...

chapter

Residual Factor Analysis for Text-Independent Speaker Verification

Lei Zhu, Rong Zheng, Bo Xu

2009 Chinese Conference on Pattern Recognition > 1 - 5

2009 Chinese Conference on Pattern Recognition. (CCPR 2009) and the First CJK Joint Workshop on Pattern Recognition (CJKPR)

Joint factor analysis (JFA) has become the state-of-the-art technique for speaker verification. At the same time, training the eigenvoice matrix is a heavy burden, because it requires a large amount of multi-channel data, which largely determines the performance of the system. In this paper, we first try to exploit an upper bound performance of the JFA system in a non-normal...

chapter

Applying Restrained Likelihood and Floating TMR to Multi-Speaker Identification for Co-channel Speech

Yong Guan, Peng Li, Xueliang Zhang, Wenju Liu, et al.

2007 International Conference on Natural Language Processing and Knowledge Engineering > 459 - 462

2007 IEEE International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE '07)

In this paper, a multi-speaker identification system for co-channel speech is proposed. Using the restrained likelihood and floating TMR methods, the system can identify two speakers in co-channel speech with high accuracy.

chapter

Combining Machine Learning and Computational Auditory Scene Analysis to Separate Monaural Speech of Two-Talker

Peng Li, Yong Guan, Wenju Liu, Bo Xu

2007 International Conference on Natural Language Processing and Knowledge Engineering > 280 - 284

2007 IEEE International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE '07)

Monaural speech separation is one of the most difficult problems in speech signal processing. In this paper, a new method based on machine learning and computational auditory scene analysis (CASA) is proposed to separate monaural two-talker speech. Machine learning is used to learn the grouping cues from isolated clean data from a single speaker. By using a factorial-max vector...

chapter

A Two-level Method for Unsupervised Speaker-based Audio Segmentation

Shilei Zhang, Shuwu Zhang, Bo Xu

18th International Conference on Pattern Recognition (ICPR'06) > 4 > 298 - 301

2006 18th International Conference on Pattern Recognition

In this paper, we propose a two-level segmentation method that effectively detects speaker changes in a continuous audio stream. In our approach, we divide the change detection process into two levels: a region level that detects potential change regions containing candidate speaker change points, and a boundary level that searches for and refines the true change points. At the region level, we employ...


INFONA - science communication portal
