Search results for: Jia Liu

Items from 1 to 11 out of 11 results

chapter

Lattice based transcription loss for end-to-end speech recognition

Jian Kang, Wei-Qiang Zhang, Jia Liu

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)

End-to-end speech recognition systems have been successfully implemented and have become competitive replacements for hybrid systems. A common loss function to train end-to-end systems is connectionist temporal classification (CTC). This method maximizes the log likelihood between the feature sequence and the associated transcription sequence. However, there are some weaknesses with CTC training. The...

chapter

Gated recurrent units based hybrid acoustic models for robust speech recognition

Jian Kang, Wei-Qiang Zhang, Jia Liu

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)

Recurrent neural networks (RNNs) have shown an ability to model temporal dependencies. However, the problem of exploding or vanishing gradients has limited their application. In recent years, long short-term memory RNNs (LSTM RNNs) have been proposed to solve this problem, and have achieved excellent results. However, because of the large size of LSTM RNNs, they more easily suffer from overfitting,...

chapter

Stacked bottleneck features for speaker verification

Yao Tian, Liang He, Jia Liu

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 514 - 518

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

i-Vector modeling has been shown to be effective for text-independent speaker verification. It represents each utterance as a low-dimensional vector using factor analysis with a GMM supervector. In order to capture more complex speaker statistics, this paper proposes a new feature representation other than i-vectors for speaker verification using neural networks. In this work, stacked bottleneck features...

chapter

The THUEE system for the openKWS14 keyword search evaluation

Meng Cai, Zhiqiang Lv, Beili Song, Yongzhe Shi, more

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4734 - 4738

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The OpenKWS14 keyword search evaluation is one of the most challenging and influential evaluations in the field of speech recognition. Its goal is to build a high-performance keyword search system for a minority language with limited training data in a short period of time. We present the system of the Department of Electronic Engineering, Tsinghua University (THUEE team) for the OpenKWS14 keyword...

chapter

THUEE system for the Albayzin 2012 language recognition evaluation

Weiwei Liu, Wei-Qiang Zhang, Liang He, Jiaming Xu, more

2013 IEEE China Summit and International Conference on Signal and Information Processing > 109 - 112

2013 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

The Albayzin 2012 language recognition evaluation (LRE) is one of the most challenging language recognition evaluations, mainly because: (1) the target languages are more confusable with other languages, which might push down system performance; (2) the development and test data are heterogeneous regarding duration, number of speakers, ambient noise/music, channel conditions, etc.; (3) signals...

chapter

Improving deep neural network acoustic models using unlabeled data

Meng Cai, Wei-Qiang Zhang, Jia Liu

2013 IEEE China Summit and International Conference on Signal and Information Processing > 137 - 141

2013 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

The Context-Dependent Deep-Neural-Network HMM, or CD-DNN-HMM, is a powerful acoustic modeling technique. Its training process typically involves unsupervised pre-training and supervised fine-tuning. In this paper, we demonstrate that the performance of DNNs can be improved by utilizing a large amount of unlabeled data in the training procedure. In our method, CD-DNN-HMM trained using 309 hours of unlabeled...

chapter

Phone modeling and combining discriminative training for Mandarin-English bilingual speech recognition

Yanmin Qian, Jia Liu

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4918 - 4921

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

Automatic multilingual speech recognition is always a difficult task. This paper presents recent work on the development of a Mandarin-English bilingual speech recognition system. A unified single set of bilingual acoustic models based on a novel State-Time-Alignment (STA) method is proposed to balance the performance and the complexity of the bilingual speech recognition system, and a comparison...

chapter

Feature Selection Based on Mutual Information for Language Recognition

Yan Deng, Jia Liu

2009 2nd International Congress on Image and Signal Processing > 1 - 4

2009 2nd International Congress on Image and Signal Processing (CISP)

The prevailing system for language recognition is parallel phoneme recognition followed by vector space modeling (PPRVSM), which uses a vector space model to describe the co-occurrence information of phones. Because the super-vectors are composed of phonetic N-grams, high-dimensional vectors pose a problem: the number of N-grams grows exponentially as the order N increases, which will...

chapter

Research on detection algorithm of multi-class telephone signal tones

Shan Zhong, Weiqiang Zhang, Jia Liu

2008 International Conference on Audio, Language and Image Processing > 697 - 700

2008 International Conference on Audio, Language and Image Processing

This paper not only proposes a detection algorithm for traditional ring signal tones, but also investigates the Color Ring Back Tone, which has become popular in recent years. Different testing methods are given for music color ring tones and voice prompt color ring tones. Tested on the RMTS (Real-world Multi-channel Telephone Speech) database, experiments show that the detection rates...

chapter

Automatic language identification using support vector machines and phonetic N-gram

Yan Deng, Jia Liu

2008 International Conference on Audio, Language and Image Processing > 71 - 74

2008 International Conference on Audio, Language and Image Processing

In this paper, we describe two approaches for language identification (LID) using support vector machines (SVM) and phonetic n-gram. One is to use the language model scores of phone sequences to do SVM training. The other is to use the n-gram probabilities of those phones to train SVM models. For the second approach, we propose a new effective normalization method. In the experiments of 30 s test...

chapter

The application of discriminative training techniques in LID system fusion

Tao Hou, Weiqiang Zhang, Jia Liu

2008 International Conference on Audio, Language and Image Processing > 1457 - 1460

2008 International Conference on Audio, Language and Image Processing

This paper reports an approach to language identification (LID) system fusion using discriminative training. Maximum mutual information (MMI) training for Gaussian mixture models is introduced to the standard LDA-GMM fusion framework. Experimental results show that the proposed fusion scheme outperforms the maximum likelihood (ML) trained backend of the LID system. The impact of the number of Gaussian mixtures...
