Search results for: Yonghong Yan

Items from 61 to 80 out of 92 results

chapter

Commentator's Speech Extraction in Audio Stream of Sports Games

Li Lu, Fengpei Ge, Qingwei Zhao, Yonghong Yan

2009 International Conference on Research Challenges in Computer Science > 64 - 67

2009 International Conference on Research Challenges in Computer Science (ICRCCS 2009)

This paper proposes a method for extracting commentators' speech from the audio stream of live sports games. First, a two-pass metric-based audio segmentation module is developed to segment the audio stream into short segments with homogeneous acoustic features. Then a model-based classification module is adopted to extract the speech segments. For robust audio classification, various...

chapter

Applying Articulatory Features to Speech Emotion Recognition

Yu Zhou, Yanqing Sun, Lin Yang, Yonghong Yan

2009 International Conference on Research Challenges in Computer Science > 73 - 76

2009 International Conference on Research Challenges in Computer Science (ICRCCS 2009)

In this paper, we present an approach that uses articulatory features (AFs) derived from spectral features for speech emotion recognition. We also investigate the combination of AFs and spectral features. Systems based on AFs only and on combined spectral-articulatory features are tested on the CASIA Mandarin emotional corpus. Experimental results show that AFs alone are not suitable for speech emotion...

chapter

An Mandarin Pronunciation Quality Assessment System Using Two Kinds of Acoustic Models

Fengpei Ge, Li Lu, Changliang Liu, Fuping Pan, more

2009 International Conference on Research Challenges in Computer Science > 68 - 72

2009 International Conference on Research Challenges in Computer Science (ICRCCS 2009)

This paper presents our Mandarin pronunciation quality assessment system for the examination of Putonghua Shuiping Kaoshi (PSK) and investigates some measures to improve the assessment accuracy. In this paper, a selective speaker adaptation method is studied. In the adaptation module, we select well pronounced speech as the adaptation data, and adopt Maximum Likelihood Linear Regression (MLLR) to...

chapter

Language model adaptation using auto-induced semantic structures in a voice search system

Yali Li, Ta Li, Yonghong Yan

2009 IEEE International Conference on Intelligent Computing and Intelligent Systems > 3 > 350 - 353

2009 IEEE International Conference on Intelligent Computing and Intelligent Systems (ICIS 2009)

In this paper, we study how to generate in-domain data for statistical language model adaptation in a Chinese voice search dialogue system. Given a limited amount of in-domain data, we use unsupervised clustering to induce semantic classes and structures from the first part of the test data. These structures are further augmented with domain information to generate a large amount of in-domain data. Lastly...

chapter

A Keyword Spotting Based Sports Type Determination System

Li Lu, Ran Xu, Fengpei Ge, Qingwei Zhao, more

2009 International Conference on Artificial Intelligence and Computational Intelligence > 2 > 361 - 365

2009 International Conference on Artificial Intelligence and Computational Intelligence (AICI 2009)

This paper proposes a novel system to automatically determine the sports type of a sports game by conducting keyword spotting on short fragments (around 10 minutes) of a sports game. In this system, we first develop an audio segmentation module as a front-end to separate announcers' speech efficiently from the complex sports audio stream. Then we employ speech recognition technology on these speech...

chapter

Automatic Detection of Pathological Voices Using GMM-SVM Method

Xiang Wang, Jianping Zhang, Yonghong Yan

2009 2nd International Conference on Biomedical Engineering and Informatics > 1 - 4

2009 2nd International Conference on Biomedical Engineering and Informatics (BMEI)

Modern lifestyles have increased the risk of pathological voice problems, so therapy for people with voice pathologies is attracting more attention. Meanwhile, acoustic features have been used widely in the therapy of people with voice disorders. Classification of normal and pathological speakers is also an auxiliary operation in therapy. MFCC has been proved to be a useful feature with traditional classifier...

chapter

Sample-Based Automatic Dictionary Generation for Keyword Spotting System

Li Lu, Fengpei Ge, Ta Li, Qingwei Zhao, more

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery > 5 > 505 - 508

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2009)

In this paper we develop an approach to automatic, data-driven generation of pronunciation dictionaries for keyword spotting (KWS) systems. In practical applications, KWS tasks often have to deal with keywords whose pronunciations cannot be found in the dictionary. To solve this problem, we study how to derive pronunciations automatically from speech samples of keywords. Recognized sequences from...

chapter

Investigations to Minimum Phone Error Training in Bilingual Speech Recognition

Ran Xu, Qingqing Zhang, Jielin Pan, Yonghong Yan

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery > 4 > 486 - 490

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2009)

The great success of Minimum Phone Error (MPE) training criterion in mono-language large vocabulary continuous speech recognition (LVCSR) tasks motivates us to apply it to bilingual LVCSR systems. In this paper, in conjunction with the previous respectable bilingual phoneme inventory construction techniques, we give a comprehensive investigation to the performance of MPE/fMPE on various Mandarin-English...

chapter

Improving Automatic Speech Recognizer of Voice Search Using System Combination

Ta Li, Weiqun Xu, Jielin Pan, Yonghong Yan

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery > 4 > 477 - 480

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2009)

Voice search is the technology that enables users to access information using spoken queries. Automatic speech recognizer (ASR) is one of the key modules for voice search systems. However, the high error rate of the state-of-the-art large vocabulary continuous speech recognition (LVCSR) is the bottleneck for most voice search systems. In this paper, we first build a baseline system using language...

chapter

Improved Lattice-Based Confidence Measure for Speech Recognition via a Lattice Cutoff Procedure

Jie Gao, Qingwei Zhao, Ran Xu, Yonghong Yan

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery > 4 > 473 - 476

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2009)

This paper presents an improvement for confidence measure estimation as posterior probabilities on lattices in speech recognition. An observation is presented that nontarget regions, i.e. non-speech part of a spoken utterance, of different lengths may lead to different levels of over optimistic confidence measures. This may be problematic in obtaining a consistent rejection performance at the same...

chapter

Emotion Recognition and Conversion for Mandarin Speech

Yu Zhou, Jianping Zhang, Ling Wang, Yonghong Yan

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery > 1 > 179 - 183

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2009)

In this study, some research activities on expressive speech recognition and conversion are introduced. A database consisting of five kinds of speech emotions (i.e. happiness, sadness, surprise, anger and neutral) is used. Traditional features such as MFCC, PLP, and pitch are studied, and a new feature extraction method based on Fisher's F-ratio is also proposed and reported. In...

chapter

Nonnative speech recognition based on bilingual model modification

Qingqing Zhang, Jielin Pan, Shui-duen Chan, Yonghong Yan

2009 IEEE International Conference on Fuzzy Systems > 110 - 114

2009 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)

This paper presents a novel bilingual model modification approach to improve nonnative speech recognition accuracy when the variations of accented pronunciations occur. Each state of baseline nonnative acoustic model is modified with several candidate states from the auxiliary acoustic model, which is trained on speakers' mother language. State mapping criterion and n-best candidates are investigated,...

chapter

Chinese Prosody Structure Prediction Based on Conditional Random Fields

Jingwei Sun, Jing Yang, Jianping Zhang, Yonghong Yan

2009 Fifth International Conference on Natural Computation > 3 > 602 - 606

2009 Fifth International Conference on Natural Computation (ICNC 2009)

In this paper, a novel statistical method based on conditional random fields (CRF) is proposed for hierarchical prosody structure prediction, which is a key module in speech synthesis systems. We discuss in detail how to build the prosody models for Mandarin Chinese using conditional random fields, including corpus preparation, feature selection, feature template design, model training and evaluation...

chapter

SVM Based Speaker Recognition Using Maximum a posteriori Linear Regression

Xiang Zhang, Qingwei Zhao, Yonghong Yan

2009 International Conference on Electronic Computer Technology > 438 - 442

2009 International Conference on Electronic Computer Technology. ICECT 2009

Maximum likelihood linear regression (MLLR) is a widely used technique for speaker adaptation in large vocabulary speech recognition system. Recently, using MLLR transforms as features for SVM based speaker recognition tasks has been proposed, achieving performance comparable to that obtained with cepstral features. In this paper, we focus on calculating the transforms based on a GMM universal background...

chapter

Using Eigenvoice Coefficients as Features in Speaker Recognition

Haipeng Wang, Qingwei Zhao, Yonghong Yan

2009 International Conference on Electronic Computer Technology > 262 - 266

2009 International Conference on Electronic Computer Technology. ICECT 2009

Eigenvoice speaker adaptation has been shown to be effective in recent years. In this paper, we propose to use eigenvoice coefficients as features for speaker recognition. We use a simplified version of probabilistic subspace adaptation (PSA) to estimate eigenvoice coefficients, and the coefficients are concatenated to construct supervectors of support vector machines. This approach significantly...

chapter

A Synchronous Method for Automatic Scoring of Language Learning

Bin Dong, Yonghong Yan

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 5

2008 6th International Symposium on Chinese Spoken Language Processing

In this paper, a synchronous method based on a state graph is proposed to calculate the evaluation feature for automatic scoring in computer-assisted language learning (CALL). The posterior probabilities of states are selected as the main feature. The scores of hypothesized phonemes and words are estimated using the information of the corresponding states. Traditional systems use two passes and two different...

chapter

Efficient System Combination for Syllable-Confusion-Network-Based Chinese Spoken Term Detection

Jie Gao, Qingwei Zhao, Yonghong Yan, Jian Shao

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

2008 6th International Symposium on Chinese Spoken Language Processing

This paper examines the system combination issue for syllable-confusion-network (SCN)-based Chinese spoken term detection (STD). System combination for STD usually leads to improvements in accuracy but suffers from increased index size or complicated index structure. This paper explores methods for efficient combination of a word-based system and a syllable-based system while keeping the compactness...

chapter

Speaker Recognition using a Kind of Novel Phonotactic Information

Xiang Zhang, Xiang Xiao, Haipeng Wang, Hongbin Suo, more

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

2008 6th International Symposium on Chinese Spoken Language Processing

In this paper, we present a new modeling approach for speaker recognition, which uses a kind of novel phonotactic information as the feature for SVM modeling. Gaussian mixture models (GMMs) have been proven extremely successful for text-independent speaker recognition. The GMM universal background model (UBM) is a speaker-independent model, each component of which can be considered to be modeling...

chapter

Improved Semi-Parametric Mean Trajectory Model Using Discriminatively Trained Centroids

Ran Xu, Jielin Pan, Yonghong Yan

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

2008 6th International Symposium on Chinese Spoken Language Processing

In order to alleviate the limitation of the "state output probability conditional independence" assumption held by Hidden Markov models (HMMs) in speech recognition, a discriminative semi-parametric trajectory model was proposed in recent years, in which both means and variances in the acoustic models are modeled as time-varying variables. The time-varying information is modeled as a weighted...

chapter

Using Reference to Tune Language Model for Detection of Reading Miscues

Changliang Liu, Fuping Pan, Fengpei Ge, Bin Dong, more

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

2008 6th International Symposium on Chinese Spoken Language Processing

For a reading tutor, the reference content which the reader reads is known beforehand. This a priori information is very important in automatic detection of reading miscues. This paper proposes two methods to incorporate the reference information into the LVCSR framework to improve the performance of miscue detection. Both methods tune the n-gram Language Model (LM) probabilities dynamically in...

Keywords:
SPEECH


INFONA - science communication portal