Search results for: Yong QIN

Items from 1 to 12 out of 12 results

chapter

Wake-up-word spotting using end-to-end deep neural network system

Shilei Zhang, Wen Liu, Yong Qin

2016 23rd International Conference on Pattern Recognition (ICPR) > 2878 - 2883

2016 23rd International Conference on Pattern Recognition (ICPR)

Deep neural networks (DNNs) have tremendously improved the performance of automatic speech recognition (ASR). On the other hand, end-to-end speech recognition systems can achieve state-of-the-art performance using Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) and the Connectionist Temporal Classification (CTC) method for unsegmented sequence data. In this paper, we therefore propose a lightweight...

chapter

Generating compound words with high order n-gram information in large vocabulary speech recognition systems

Jie Zhou, Qin Shi, Yong Qin

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5560 - 5563

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this work we concentrate on generating compound words with high order n-gram information for speech recognition. In most existing compound word generation methods, only bi-gram information is considered. They are successful at improving the performance of bi-gram models but do not work well in higher order n-gram cases. Since 3-gram and 4-gram language models are now commonly used, here...

chapter

Improved Mandarin Keyword Spotting Using Confusion Garbage Model

Shilei Zhang, Zhiwei Shuang, Qin Shi, Yong Qin

2010 20th International Conference on Pattern Recognition > 3700 - 3703

2010 20th International Conference on Pattern Recognition (ICPR 2010)

This paper presents an improved acoustic keyword spotting (KWS) algorithm using a novel confusion garbage model in Mandarin conversational speech. Observing the KWS corpus, we found that many words are pronounced similarly to the predefined keywords, although they have different Chinese characters and different meanings, which easily results in a high false alarm rate. In this paper, an improved...

chapter

Automatic Pronunciation Transliteration for Chinese-English Mixed Language Keyword Spotting

Shilei Zhang, Zhiwei Shuang, Yong Qin

2010 20th International Conference on Pattern Recognition > 1610 - 1613

2010 20th International Conference on Pattern Recognition (ICPR 2010)

This paper presents an automatic pronunciation transliteration method with acoustic and contextual analysis for a Chinese-English mixed language keyword spotting (KWS) system. Often, we need to develop robust Chinese-English mixed language spoken language technology without Chinese accented English acoustic data. In this paper, we exploit a pronunciation conversion method based on syllable-based characteristic...

chapter

Modeling Syllable-Based Pronunciation Variation for Accented Mandarin Speech Recognition

Shilei Zhang, Qin Shi, Yong Qin

2010 20th International Conference on Pattern Recognition > 1606 - 1609

2010 20th International Conference on Pattern Recognition (ICPR 2010)

Pronunciation variation is a natural and inevitable phenomenon in an accented Mandarin speech recognition application. In this paper, we integrate knowledge-based and data-driven approaches for syllable-based pronunciation variation modeling to improve the performance of a Mandarin speech recognition system for speakers with a Southern accent. First, we generate the syllable-based pronunciation...

chapter

The 2009 IBM GALE Mandarin broadcast transcription system

Stephen M Chu, Daniel Povey, Hong-Kwang Kuo, Lidia Mangu, et al.

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4374 - 4377

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper gives an up-to-date description of the IBM Mandarin broadcast transcription system developed under the DARPA GALE program. Technical advances over our previous system include a novel acoustic modeling approach using subspace Gaussian mixture models, a speaking rate adaptation method using frame rate normalization, and an effective recipe for lattice combination. We present results on three...

chapter

Chinese prosodic phrasing with the source-channel model

Honghui Dong, Yong Qin, Limin Jia

2009 Chinese Control and Decision Conference > 6168 - 6171

2009 Chinese Control and Decision Conference (CCDC 2009)

Prosodic phrasing is a classic problem in natural language processing; it is useful not only for text-to-speech (TTS) but also for speech recognition, statistical machine learning, etc. This paper introduces and discusses the source-channel model for Chinese prosodic phrasing. Based on the basic idea, the hidden Markov model (HMM) and the improved source-channel model are both used to describe the phrasing...

chapter

A phrase-level piecewise linear scaling algorithm for melody match in Query-by-Humming systems

Wenxiao Cao, Dan-ning Jiang, Jue Hou, Yong Qin, et al.

2009 IEEE International Conference on Multimedia and Expo > 942 - 945

2009 IEEE International Conference on Multimedia and Expo (ICME)

The Query-by-Humming (QBH) system allows users to retrieve songs by singing/humming. In this paper we propose a phrase-level piecewise linear scaling algorithm for melody match. Musical phrase boundaries are predicted for the query to split it into phrases. The boundaries of the melody fragment corresponding to each phrase may be adjusted within a limited scope. The algorithm employs Dynamic Programming...

chapter

Utterance verification using improved confidence measures based on alignment confusion rate in Chinese digits recognition

Shilei Zhang, Danning Jiang, Yong Qin

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 1309 - 1312

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper, we explore an approach to improved confidence measures based on a novel alignment confusion rate (ACR), which integrates alignment information from two different modeling unit sets in a Chinese digits recognition system. Both the initial-final (IF) phone set and head-body-tail (HBT) models have been shown to achieve good recognition performance for connected digit strings. These two different modeling...

chapter

Main vowel domain tone modeling with lexical and prosodic analysis for Mandarin ASR

Shilei Zhang, Qin Shi, S.M. Chu, Yong Qin

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4561 - 4564

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

The tone is a distinctive discriminative feature in Mandarin Chinese. Often functional, yet seldom thorough are most large-scale Mandarin speech recognition systems in treating tone modeling. In particular, many lack the necessary sophistication to deal with the myriad variations arising from the combination of acoustic and lexical contexts. This paper reports an attempt to account for these variabilities...

chapter

Recent advances in the IBM GALE Mandarin transcription system

S.M. Chu, Hong-kwang Kuo, L. Mangu, Yi Liu, et al.

2008 IEEE International Conference on Acoustics, Speech and Signal Processing > 4329 - 4332

ICASSP 2008 - 2008 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper describes the system and algorithmic developments in the automatic transcription of Mandarin broadcast speech made at IBM in the second year of the DARPA GALE program. Technical advances over our previous system include improved acoustic models using embedded tone modeling, and a new topic-adaptive language model (LM) rescoring technique based on dynamically generated LMs. We present results...

chapter

The IBM Mandarin Broadcast Speech Transcription System

Stephen M. Chu, Hong-kwang Kuo, Yi Y Liu, Yong Qin, et al.

2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07 > 2 > II-345 - II-348

2007 IEEE International Conference on Acoustics, Speech, and Signal Processing

This paper describes the technical and system building advances in the automatic transcription of Mandarin broadcast speech made at IBM in the first year of the DARPA GALE program. In particular, we discuss the application of minimum phone error (MPE) discriminative training and a new topic-adaptive language modeling technique. We present results on both the RT04 evaluation data and two larger community-defined...

Filter options

Keywords:
SPEECH RECOGNITION


INFONA - science communication portal
