Search results for: QingQing Zhang

Items 1 to 9 of 9 results

chapter

Improving HMM/DNN in ASR of under-resourced languages using probabilistic sampling

Meixu Song, Qingqing Zhang, Jielin Pan, Yonghong Yan

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 20 - 24

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

In HMM/DNN automatic speech recognition (ASR) systems, the DNNs model the posterior probabilities for triphone states. However, triphone states are unevenly distributed. In this situation, the training algorithm tends to converge to a local optimum more related to states with rich data than states with poor data. Thus, the imbalance in the training data degrades ASR performance, especially for...

chapter

Boosted Hybrid DNN/HMM System Based on Correlation-Generated Targets

Mengzhe Chen, Qingqing Zhang, Jielin Pan, Yonghong Yan

2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing > 590 - 593

2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP)

In current DNN/HMM hybrid systems, the DNN models are trained on 1-of-V targets obtained by Viterbi-based forced alignment. The states are viewed as unrelated and isolated. In fact, some phonemes are acoustically similar. Especially for Chinese, a tonal language, the number of similar pairs is quadrupled. To add the similarity information between states into the model training,...

chapter

Improving Korean LVCSR with Long-Time Temporal Patterns and an Extended Phoneme Set

Ji Xu, Zhen Zhang, Qingqing Zhang, Jielin Pan, et al.

2013 Fourth Global Congress on Intelligent Systems > 336 - 340

2013 Fourth Global Congress on Intelligent Systems (GCIS)

Korean is an agglutinative language, in which pronunciations are affected by long-term context. In this paper, the long-time temporal information is investigated to improve Korean LVCSR. TRAP-based MLP features, which are able to utilize the scattered acoustic information over several hundred milliseconds, are employed to obtain additional information besides the conventional cepstral features. In...

chapter

Improved lattice rescoring by using speech attributes in Large Vocabulary Continuous Speech Recognition systems

Xinglong Gao, Qingqing Zhang, Jielin Pan

2013 6th International Congress on Image and Signal Processing (CISP) > 1 > 143 - 147

2013 6th International Congress on Image and Signal Processing (CISP)

Acoustic modeling in Large Vocabulary Continuous Speech Recognition (LVCSR) systems, which is normally based on context-dependent phones, is heavily limited by the representative capability between transcriptions and the corresponding variation of raw speech utterances. To describe this relationship more accurately, this paper presents an alternative strategy by which speech attributes are used to capture acoustic...

chapter

Improved modeling for F0 generation and V/U decision in HMM-based TTS

Qingqing Zhang, Frank Soong, Yao Qian, Zhijie Yan, et al.

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4606 - 4609

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

The HMM-based TTS can produce a highly intelligible and decent quality voice. However, sometimes the synthesized speech exhibits perceptibly annoying glitches due to F0 extraction errors in the training data and voiced/unvoiced swapping errors in F0 generation. In the conventional MSD based F0 modeling [10], the dual but incompatible two probabilistic spaces, the continuous probability density for...

chapter

Investigations to Minimum Phone Error Training in Bilingual Speech Recognition

Ran Xu, Qingqing Zhang, Jielin Pan, Yonghong Yan

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery > 4 > 486 - 490

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2009)

The great success of the Minimum Phone Error (MPE) training criterion in mono-language large vocabulary continuous speech recognition (LVCSR) tasks motivates us to apply it to bilingual LVCSR systems. In this paper, in conjunction with the previous respectable bilingual phoneme inventory construction techniques, we give a comprehensive investigation of the performance of MPE/fMPE on various Mandarin-English...

chapter

Nonnative speech recognition based on bilingual model modification

Qingqing Zhang, Jielin Pan, Shui-duen Chan, Yonghong Yan

2009 IEEE International Conference on Fuzzy Systems > 110 - 114

2009 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)

This paper presents a novel bilingual model modification approach to improve nonnative speech recognition accuracy when variations in accented pronunciation occur. Each state of the baseline nonnative acoustic model is modified with several candidate states from the auxiliary acoustic model, which is trained on the speakers' native language. State mapping criterion and n-best candidates are investigated,...

chapter

Spoken Term Detection Using Dynamic Match Subword Confusion Network

Jie Gao, Jian Shao, Qingqing Zhang, Qingwei Zhao, et al.

2008 Fourth International Conference on Natural Computation > 4 > 250 - 254

2008 Fourth International Conference on Natural Computation (ICNC)

This paper details our subword confusion network based approach for Mandarin spoken term detection. As well as the system description, two approaches are presented for improvement of our baseline system. To reduce the inherent high recognition error of the subword decoding system due to its weak language model constraints, the subword confusion network is proposed to be generated from the word decoding...

chapter

State-based bilingual model modification for nonnative speech recognition

Qingqing Zhang, Ta Li, Jielin Pan, Yonghong Yan

2008 International Conference on Audio, Language and Image Processing > 1300 - 1305

2008 International Conference on Audio, Language and Image Processing

Speech recognition accuracy has been observed to decrease for nonnative speakers, especially those who are just beginning to learn a foreign language or who have heavy accents. This paper presents a novel bilingual model modification approach to improve nonnative speech recognition by considering these great variations of accented pronunciations. Each state of the baseline nonnative acoustic models...

Filter options

Keywords:
SPEECH

Keywords

HIDDEN MARKOV MODELS (8)
SPEECH RECOGNITION (8)
ACOUSTICS (7)
TRAINING (4)
ADAPTATION MODEL (2)
CORRELATION (2)
DETECTORS (2)
FEATURE EXTRACTION (2)
LATTICES (2)
NATURAL LANGUAGE PROCESSING (2)
PHRASE ERROR RATE (2)
TRAINING DATA (2)
ACCENTED PRONUNCIATION (1)
AGGLUTINATIVE LANGUAGE (1)
AUTOMATIC SPEECH RECOGNITION (1)
AUXILIARY ACOUSTIC MODEL (1)
BASELINE NONNATIVE ACOUSTIC MODEL (1)
BILINGUAL MODEL MODIFICATION APPROACH (1)
BILINGUAL PHONEME INVENTORY CONSTRUCTION (1)
BILINGUAL SPEECH RECOGNITION (1)
BIOLOGICAL SYSTEM MODELING (1)
COMPUTATIONAL MODELING (1)
CONFUSION NETWORK (1)
CONTEXT (1)
CONTINUOUS PROBABILITY DENSITY (1)
CORRELATION-GENERATED TARGETS (1)
DATA MODELS (1)
DATABASES (1)
DECODING (1)
DICTIONARIES (1)
DISCRETE PROBABILITY (1)
DISCRIMINATIVE TRAINING (1)
DYNAMIC MATCH SUBWORD CONFUSION NETWORK (1)
ERROR ANALYSIS (1)
ERROR STATISTICS (1)
F0 EXTRACTION ERRORS (1)
F0 GENERATION (1)
FMPE (1)
FOREIGN LANGUAGE (1)
GAUSSIAN MIXTURES (1)
GAUSSIAN PROCESSES (1)
GRAMMAR (1)
GRAMMAR-CONSTRAINED SPEECH RECOGNITION SYSTEM (1)
GRAMMARS (1)
HEURISTIC ALGORITHMS (1)
HIDDEN MARKOV MODEL (1)
HMM/DNN HYBRID (1)
HMM-BASED TTS (1)
HYBRID DNN/HMM SYSTEM (1)
KOREAN LVCSR (1)
LANGUAGE MODEL CONSTRAINTS (1)
LIKELIHOOD BASED FRAME OCCUPANCY (1)
LONG-TIME TEMPORAL INFORMATION (1)
MANDARIN SPEECH RECOGNITION (1)
MANDARIN-ENGLISH BILINGUAL TEST SETS (1)
MAXIMUM LIKELIHOOD ESTIMATION (1)
MINIMUM EDIT DISTANCE METHOD (1)
MINIMUM PHONE ERROR (1)
MINIMUM PHONE ERROR TRAINING (1)
MONO-LANGUAGE LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION (1)
MULTILINGUAL RECOGNITION (1)
N-BEST CANDIDATES (1)
NATURAL LANGUAGES (1)
NONNATIVE ACOUSTIC MODEL (1)
NONNATIVE SPEAKER (1)
NONNATIVE SPEECH RECOGNITION (1)
PHONEME SET (1)
PIECE-WISE CONTINUOUS F0 TRAJECTORY (1)
PROBABILISTIC LOGIC (1)
PROBABILISTIC SAMPLING (1)
REAL TIME FACTOR (1)
REAL TIME SYSTEMS (1)
RECOGNITION ERROR (1)
SPEAKER RECOGNITION (1)
SPEECH CODING (1)
SPEECH PROCESSING (1)
SPEECH SYNTHESIS (1)
SPOKEN TERM DETECTION (1)
STATE MAPPING (1)
STATE MAPPING CRITERION (1)
STATE-BASED BILINGUAL MODEL MODIFICATION (1)
STATISTICAL ANALYSIS (1)
SUBWORD DECODING SYSTEM (1)
TEXT-TO-SPEECH SYNTHESIS (1)
TRAJECTORY (1)
UNDER-RESOURCED LANGUAGES (1)
V/U DECISION (1)
V/U DECISION MODEL (1)
VOICING STRENGTH (1)
WORD PROCESSING (1)
