Search results for: Qingqing Zhang

Items from 1 to 11 out of 11 results

chapter

Improving HMM/DNN in ASR of under-resourced languages using probabilistic sampling

Meixu Song, Qingqing Zhang, Jielin Pan, Yonghong Yan

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 20 - 24

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

In HMM/DNN automatic speech recognition (ASR) systems, the DNNs model the posterior probabilities for triphone states. However, triphone states are unevenly distributed. In this situation, the training algorithm tends to converge to a local optimum more related to states with rich data than states with poor data. Thus, the imbalance of the training data decreases the ASR performances, especially for...

chapter

Boosted Hybrid DNN/HMM System Based on Correlation-Generated Targets

Mengzhe Chen, Qingqing Zhang, Jielin Pan, Yonghong Yan

2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing > 590 - 593

2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP)

In current DNN/HMM hybrid systems, the DNN models are trained by the 1-of-V targets which are obtained by the Viterbi-based forced-alignment. The states are viewed as unrelated and isolated. In fact, some phonemes are acoustically similar. Especially for Chinese, as a tonal language, its number of similar pairs is quadrupled. To add the similarity information between states into the model training,...

chapter

Improving Korean LVCSR with Long-Time Temporal Patterns and an Extended Phoneme Set

Ji Xu, Zhen Zhang, Qingqing Zhang, Jielin Pan, et al.

2013 Fourth Global Congress on Intelligent Systems > 336 - 340

2013 Fourth Global Congress on Intelligent Systems (GCIS)

Korean is an agglutinative language, in which pronunciations are affected by long-term context. In this paper, the long-time temporal information is investigated to improve Korean LVCSR. TRAP-based MLP features, which are able to utilize the scattered acoustic information over several hundred milliseconds, are employed to obtain additional information besides the conventional cepstral features. In...

chapter

Improved lattice rescoring by using speech attributes in Large Vocabulary Continuous Speech Recognition systems

Xinglong Gao, Qingqing Zhang, Jielin Pan

2013 6th International Congress on Image and Signal Processing (CISP) > 1 > 143 - 147

2013 6th International Congress on Image and Signal Processing (CISP)

Acoustic modeling of Large Vocabulary Continuous Speech Recognition (LVCSR) systems, which is normally based on context-dependent phones, is heavily limited by the representative capability between transcriptions and the corresponding variation of raw speech utterances. To describe this relationship more accurately, this paper presents an alternative strategy by which speech attributes are used to capture acoustic...

chapter

Web-Based Language Model Domain Adaptation for Real World Voice Retrieval

Mengzhe Chen, Qingqing Zhang, Zhichao Wang, Jielin Pan, et al.

2013 Ninth International Conference on Computational Intelligence and Security > 100 - 104

2013 Ninth International Conference on Computational Intelligence and Security (CIS)

This paper presents our recent work on the development of a real-world voice retrieval system, which automatically updates language models for a specific domain with the latest web data. Two of the main difficulties in handling this system are tackled in this paper. First, when people use voice retrieval systems, newly created "hot words" are input as the keywords. In order to ensure...

chapter

Subset selection for articulatory feature based confidence measures

Yanqing Sun, Qingwei Zhao, Qingqing Zhang, Yu Zhou, et al.

Third International Workshop on Advanced Computational Intelligence > 549 - 553

2010 Third International Workshop on Advanced Computational Intelligence (IWACI 2010)

This paper reports our recent work on optimizing the AF (articulatory features) based confidence measures, and combining them with the traditional HMM-based confidence measures. Different articulatory properties are analyzed using a separate AF-based confidence calculation method proposed in this paper, and are observed to be both complementary and redundant. A more compact subset is chosen and assembled...

chapter

Improved modeling for F0 generation and V/U decision in HMM-based TTS

Qingqing Zhang, Frank Soong, Yao Qian, Zhijie Yan, et al.

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4606 - 4609

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

The HMM-based TTS can produce a highly intelligible and decent quality voice. However, sometimes the synthesized speech exhibits perceptibly annoying glitches due to F0 extraction errors in the training data and voiced/unvoiced swapping errors in F0 generation. In the conventional MSD based F0 modeling [10], the dual but incompatible two probabilistic spaces, the continuous probability density for...

chapter

Investigations to Minimum Phone Error Training in Bilingual Speech Recognition

Ran Xu, Qingqing Zhang, Jielin Pan, Yonghong Yan

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery > 4 > 486 - 490

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2009)

The great success of Minimum Phone Error (MPE) training criterion in mono-language large vocabulary continuous speech recognition (LVCSR) tasks motivates us to apply it to bilingual LVCSR systems. In this paper, in conjunction with the previous respectable bilingual phoneme inventory construction techniques, we give a comprehensive investigation to the performance of MPE/fMPE on various Mandarin-English...

chapter

Nonnative speech recognition based on bilingual model modification

Qingqing Zhang, Jielin Pan, Shui-duen Chan, Yonghong Yan

2009 IEEE International Conference on Fuzzy Systems > 110 - 114

2009 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)

This paper presents a novel bilingual model modification approach to improve nonnative speech recognition accuracy when the variations of accented pronunciations occur. Each state of baseline nonnative acoustic model is modified with several candidate states from the auxiliary acoustic model, which is trained on speakers' mother language. State mapping criterion and n-best candidates are investigated,...

chapter

Nonnative Speech Recognition Based on State-Level Bilingual Model Modification

Qingqing Zhang, Ta Li, Jielin Pan, Yonghong Yan

2008 Third International Conference on Convergence and Hybrid Information Technology > 2 > 1220 - 1225

2008 Third International Conference on Convergence and Hybrid Information Technology (ICCIT)

The performance of automatic speech recognition decreases drastically for nonnative speakers, especially those who are just beginning to learn a foreign language or who have heavy accents. This paper presents a novel bilingual model modification approach to improve nonnative speech recognition by considering these great variations of accented pronunciations. Each state of baseline nonnative acoustic...

chapter

State-based bilingual model modification for nonnative speech recognition

Qingqing Zhang, Ta Li, Jielin Pan, Yonghong Yan

2008 International Conference on Audio, Language and Image Processing > 1300 - 1305

2008 International Conference on Audio, Language and Image Processing

Speech recognition accuracy has been observed to decrease for nonnative speakers, especially those who are just beginning to learn a foreign language or who have heavy accents. This paper presents a novel bilingual model modification approach to improve nonnative speech recognition by considering these great variations of accented pronunciations. Each state of the baseline nonnative acoustic models...

