Search results for: Jia Liu

Items from 1 to 20 out of 37 results

chapter

An LSTM-CTC based verification system for proxy-word based OOV keyword search

Zhiqiang Lv, Jian Kang, Wei-Qiang Zhang, Jia Liu

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5655 - 5659

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Proxy-word based out of vocabulary (OOV) keyword search has been proven to be quite effective in keyword search. In proxy-word based OOV keyword search, each OOV keyword is assigned several proxies and detections of the proxies are regarded as detections of the OOV keywords. However, the confidence scores of these detections are still those of the proxies from lattices. To obtain a better confidence...

chapter

A speech enhancement algorithm using computational auditory scene analysis with spectral subtraction

Cong Guo, Like Hui, Wei-Qiang Zhang, Jia Liu

2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 6 - 10

2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)

Computational auditory scene analysis (CASA) systems have been widely used for speech enhancement in recent years. We propose a new system that combines CASA and spectral subtraction to obtain better enhanced speech. The CASA part uses a recent method, deep neural networks (DNNs). The original way to reconstruct the denoised signal is to use the estimated masks with the direct overlap-add method, ignoring...

chapter

Application of i-vector in speech and music classification

Hao Zhang, Xu-Kui Yang, Wei-Qiang Zhang, Wen-Lin Zhang, more

2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 1 - 5

2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)

This paper proposes a speech/music classification system based on i-vectors. Two classification methods, cosine distance scoring (CDS) and support vector machines (SVM), are analyzed. Two session compensation methods, within-class covariance normalization (WCCN) and linear discriminant analysis (LDA), are also discussed. The proposed systems yield better results compared...

chapter

A scheme discriminating between synthetic speech and normal speech

Jilun Chen, Weiqiang Zhang, Jia Liu

2016 International Conference on Audio, Language and Image Processing (ICALIP) > 683 - 688

2016 International Conference on Audio, Language and Image Processing (ICALIP)

This paper develops a system to automatically distinguish natural speech from synthetic speech. The issue of feature selection is considered. We take the commonly used Mel-Frequency Cepstrum Coefficient (MFCC) feature into consideration, as well as other features such as Relative Phase Shift (RPS) and pitch tuned for Automatic Speech Recognition (ASR). We found some features are complementary in the...

chapter

Calibration of word posterior estimation in confusion networks for keyword search

Zhiqiang Lv, Meng Cai, Wei-Qiang Zhang, Jia Liu

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 148 - 151

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

Word posterior probability has been widely used as the confidence estimation of automatic speech recognition (ASR) systems and has been proved to be quite effective in related applications such as keyword search. However, word posterior probability tends to overestimate the true probability of a hypothesis, as it is computed on a subset of the total hypothesis space. In this paper, we show that a...

chapter

Convolutional maxout neural networks for speech separation

Like Hui, Meng Cai, Cong Guo, Liang He, more

2015 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 24 - 27

2015 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)

Speech separation based on deep neural networks (DNNs) has been widely studied recently, and has achieved considerable success. However, previous studies are mostly based on fully-connected neural networks. In order to capture the local information of speech signals, we propose to use convolutional maxout neural networks (CMNNs) to separate speech and noise by estimating the ideal ratio mask of the...

chapter

Stacked bottleneck features for speaker verification

Yao Tian, Liang He, Jia Liu

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 514 - 518

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

i-Vector modeling has been shown to be effective for text-independent speaker verification. It represents each utterance as a low-dimensional vector using factor analysis with a GMM supervector. In order to capture more complex speaker statistics, this paper proposes a new feature representation other than i-vectors for speaker verification using neural networks. In this work, stacked bottleneck features...

chapter

The THUEE system for the openKWS14 keyword search evaluation

Meng Cai, Zhiqiang Lv, Beili Song, Yongzhe Shi, more

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4734 - 4738

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The OpenKWS14 keyword search evaluation is one of the most challenging and influential evaluations in the field of speech recognition. Its goal is to build a high-performance keyword search system for a minority language with limited training data in a short period of time. We present the system of the Department of Electronic Engineering, Tsinghua University (THUEE team) for the OpenKWS14 keyword...

chapter

THUEE system for the Albayzin 2012 language recognition evaluation

Weiwei Liu, Wei-Qiang Zhang, Liang He, Jiaming Xu, more

2013 IEEE China Summit and International Conference on Signal and Information Processing > 109 - 112

2013 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

The Albayzin 2012 language recognition evaluation (LRE) is one of the most challenging language recognition evaluations, which is mainly reflected in: (1) the target languages are more confusable with other languages, which might push down the system performance; (2) development and test data are heterogeneous regarding duration, number of speakers, ambient noise/music, channel conditions, etc.; (3) signals...

chapter

Improve low-resource non-native mispronunciation detection with native speech by articulatory-based tandem feature

Hua Yuan, Ji Xu, Junhong Zhao, Jia Liu

2013 IEEE China Summit and International Conference on Signal and Information Processing > 127 - 131

2013 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

In this paper, we propose a method to improve detection of the mispronunciation types of non-native learners. In order to cope with the low-resource condition of non-native speech and the difference between native and non-native speech, the following efforts are made: 1) train the acoustic model with the low-resource non-native data; 2) introduce the articulatory-based tandem feature; 3) pool auxiliary native...

chapter

Improving deep neural network acoustic models using unlabeled data

Meng Cai, Wei-Qiang Zhang, Jia Liu

2013 IEEE China Summit and International Conference on Signal and Information Processing > 137 - 141

2013 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

The Context-Dependent Deep-Neural-Network HMM, or CD-DNN-HMM, is a powerful acoustic modeling technique. Its training process typically involves unsupervised pre-training and supervised fine-tuning. In this paper, we demonstrate that the performance of DNNs can be improved by utilizing a large amount of unlabeled data in the training procedure. In our method, CD-DNN-HMM trained using 309 hours of unlabeled...

chapter

Automatic pitch accent detection using auto-context with acoustic features

Junhong Zhao, Wei-Qiang Zhang, Hua Yuan, Jia Liu, more

2012 8th International Symposium on Chinese Spoken Language Processing > 247 - 251

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

In the field of prosody event detection, many local acoustic features have been proposed for representing the prosodic characteristics of speech units. The context information that represents some possible regularities underlying neighboring prosody events, however, hasn't been used effectively. The main difficulty in utilizing prosodic context is that it's hard to capture the long-distance sequential dependency...

chapter

Improve mispronunciation detection with Tandem feature

Hua Yuan, Junhong Zhao, Jia Liu

2012 8th International Symposium on Chinese Spoken Language Processing > 184 - 187

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

This paper presents a method to improve mispronunciation detection performance for low-resource acoustic models. One hour of speech data is randomly selected from CU-CHLOE to imitate the low-resource non-native English situation. The Tandem feature derived from an articulatory-based Multi-Layer Perceptron (MLP) is employed to replace the traditional spectral feature (e.g. PLP). Further, motivated by similar...

chapter

Speaker classification based on high dimension feature vector

Yi Yang, Hui Song, Jia Liu

2011 Seventh International Conference on Natural Computation > 2 > 891 - 894

2011 Seventh International Conference on Natural Computation (ICNC)

Audio indexing has been an important part of the NIST-RT-SD evaluation since 2003. Speaker diarization is one kind of audio indexing technology, in which audio is labeled by speaker. One essential component of speaker diarization is speaker clustering, which always serves as pre-processing for speech recognition. The general method is to extract acoustic features such as LPCC or MFCC and build a model such as HMM...

article

Time–Frequency Cepstral Features and Heteroscedastic Linear Discriminant Analysis for Language Recognition

Wei-Qiang Zhang, Liang He, Yan Deng, Jia Liu, more

IEEE Transactions on Audio, Speech, and Language Processing > 2011 > 19 > 2 > 266 - 276

The shifted delta cepstrum (SDC) is a widely used feature extraction for language recognition (LRE). With a high context width due to incorporation of multiple frames, SDC outperforms traditional delta and acceleration feature vectors. However, it also introduces correlation into the concatenated feature vector, which increases redundancy and may degrade the performance of backend classifiers. In...

chapter

Multi-feature combination for speaker recognition

Zhi-Yi Li, Liang He, Wei-Qiang Zhang, Jia Liu

2010 7th International Symposium on Chinese Spoken Language Processing > 318 - 321

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

Combination of different features has been proved to be a good method for improving performance in speech recognition. In speaker recognition (SRE), various features have also been developed to reflect complementary aspects of speakers' characteristics. This paper proposes an effective multi-feature combination for speaker recognition. In order to avoid the “dimensionality disaster” and to delimit...

chapter

A robust algorithm of double talk detection based on voice activity detection

Hanbo Bao, Yi Yang, Jia Liu, Xiuguo Bao, more

2010 International Conference on Audio, Language and Image Processing > 12 - 15

2010 International Conference on Audio, Language and Image Processing (ICALIP)

Double talk detection is used in acoustic echo cancellation systems to keep the adaptive filter from diverging. This paper describes a new real-time double talk detection algorithm. A voice activity detection algorithm is used to detect the endpoints of each speech segment. The algorithm then uses a logic unit to detect double talk in dialogue. The new algorithm presented in this paper has robustness against...

chapter

Perturbation analysis of mel-frequency cepstrum coefficients

Wei-Qiang Zhang, Dengzhou Yang, Jia Liu, Xiuguo Bao

2010 International Conference on Audio, Language and Image Processing > 715 - 718

2010 International Conference on Audio, Language and Image Processing (ICALIP)

Mel-frequency cepstrum coefficient (MFCC) is a widely used feature vector in speech signal processing. Its feature extraction procedure can be seen as a mapping function which transfers the input speech signals to output MFCC feature vectors. However, this function is too complex to analyze and even a simple approximation is not easy to obtain. This paper studies the effects of each MFCC feature extraction...

chapter

A modified Subband post-filtering approach for MVDR beamformer

Yi Yang, Hui Song, Jia Liu, Xiuguo Bao, more

9th IEEE International Conference on Cognitive Informatics (ICCI'10) > 880 - 883

2010 9th IEEE International Conference on Cognitive Informatics (ICCI)

The MVDR beamformer is a robust beamforming method to enhance a desired (speech) signal in the presence of stationary noise. This paper presents a modified Subband post-filtering approach for the MVDR beamformer in a microphone array system. The quality of the modified Subband post-filtering is studied in simulated rooms with different noise levels and is compared to the Wiener post-filtering proposed in the literature...

chapter

A CMLLR supervector kernel for SVM language recognition

Shan Zhong, Jia Liu

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4998 - 5001

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper explores the use of constrained maximum likelihood linear regression (CMLLR) transforms as features for language recognition. Modeling is carried out through support vector machine (SVM). This work proposes a novel CMLLR supervector kernel. Results on the NIST LRE09 task show that feature-domain CMLLR transforms contain more language dependent information than model-domain MLLRs, and the...

Data set:
ieee
Keywords:
SPEECH

Publication type

book (34)
article (3)
