Search results for: Jia Liu

Items from 1 to 20 out of 27 results

chapter

Ivec-PLDA-AHC priors for VB-HMM speaker diarization system

Liang He, Xianhong Chen, Can Xu, Tianyu Liang, more

2017 IEEE International Workshop on Signal Processing Systems (SiPS) > 1 - 6

2017 IEEE International Workshop on Signal Processing Systems (SiPS)

This paper proposes a hybrid speaker diarization system. The main body is a variational Bayes hidden Markov model (VB-HMM) speaker diarization system. The VB-HMM system avoids making premature hard decisions and takes advantage of soft speaker information in an iterative way. Thus, it outperforms most mainstream speaker diarization systems. Unfortunately, this system is sensitive...

chapter

Hybrid-beamforming-based millimeter-wave cellular network optimization

Jia Liu, Elizabeth Bentley

2017 15th International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt) > 1 - 8

2017 15th International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt)

Massive MIMO and millimeter-wave communication (mmWave) have recently emerged as two key technologies for building 5G wireless networks and beyond. To reconcile the conflict between the large antenna arrays and the limited number of radio-frequency (RF) chains in mmWave systems, hybrid beamforming has become a promising solution and has received a great deal of attention in recent years...

chapter

An LSTM-CTC based verification system for proxy-word based OOV keyword search

Zhiqiang Lv, Jian Kang, Wei-Qiang Zhang, Jia Liu

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5655 - 5659

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Proxy-word based out-of-vocabulary (OOV) keyword search has proven quite effective in keyword search. In proxy-word based OOV keyword search, each OOV keyword is assigned several proxies, and detections of the proxies are regarded as detections of the OOV keywords. However, the confidence scores of these detections are still those of the proxies from lattices. To obtain a better confidence...

chapter

A speech enhancement algorithm using computational auditory scene analysis with spectral subtraction

Cong Guo, Like Hui, Wei-Qiang Zhang, Jia Liu

2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 6 - 10

2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)

Computational auditory scene analysis (CASA) systems have been widely used in speech enhancement in recent years. We propose a new system that combines CASA and spectral subtraction to obtain better enhanced speech. The CASA part uses a recent method, deep neural networks (DNNs). The original way to reconstruct the denoised signal is to use the estimated masks with the direct overlap-add method, ignoring...

chapter

Application of i-vector in speech and music classification

Hao Zhang, Xu-Kui Yang, Wei-Qiang Zhang, Wen-Lin Zhang, more

2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 1 - 5

2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)

This paper proposes a speech/music classification system based on i-vectors. Two classification methods, cosine distance scoring (CDS) and support vector machines (SVM), are analyzed. Two session compensation methods, within-class covariance normalization (WCCN) and linear discriminant analysis (LDA), are also discussed. The proposed systems yield better results compared...

chapter

Lattice based transcription loss for end-to-end speech recognition

Jian Kang, Wei-Qiang Zhang, Jia Liu

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)

End-to-end speech recognition systems have been successfully implemented and have become competitive replacements for hybrid systems. A common loss function for training end-to-end systems is connectionist temporal classification (CTC). This method maximizes the log likelihood between the feature sequence and the associated transcription sequence. However, there are some weaknesses with CTC training. The...

chapter

Gated recurrent units based hybrid acoustic models for robust speech recognition

Jian Kang, Wei-Qiang Zhang, Jia Liu

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) > 1 - 5

2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)

Recurrent neural networks (RNNs) have shown an ability to model temporal dependencies. However, the problem of exploding or vanishing gradients has limited their application. In recent years, long short-term memory RNNs (LSTM RNNs) have been proposed to solve this problem, and have achieved excellent results. However, because of the large size of LSTM RNNs, they more easily suffer from overfitting,...

chapter

Intrusion Detection Techniques Based on Improved Intuitionistic Fuzzy Neural Networks

Yang Lei, Jia Liu, Hongyan Yin

2016 International Conference on Intelligent Networking and Collaborative Systems (INCoS) > 518 - 521

2016 International Conference on Intelligent Networking and Collaborative Systems (INCoS)

At present, intrusion detection is a hot topic across the computer security field. In this paper, two novel intrusion detection techniques are proposed. First, unlike existing detection methods, this paper combines the theories of intuitionistic fuzzy sets (IFS) and artificial neural networks (ANN), which leads to far fewer iterations, higher...

chapter

THUEE language modeling method for the OpenKWS 2015 evaluation

Zhuo Zhang, Wei-Qiang Zhang, Kai-Xiang Shen, Xu-Kui Yang, more

2015 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 534 - 538

2015 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)

In this paper, we describe the THUEE (Department of Electronic Engineering, Tsinghua University) team's method of building language models (LMs) for the OpenKWS 2015 Evaluation held by the National Institute of Standards and Technology (NIST). Because NIST provided very limited in-domain data, most of our time and effort went into making good use of the out-of-domain data. There are three main...

chapter

Convolutional maxout neural networks for speech separation

Like Hui, Meng Cai, Cong Guo, Liang He, more

2015 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 24 - 27

2015 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)

Speech separation based on deep neural networks (DNNs) has been widely studied recently, and has achieved considerable success. However, previous studies are mostly based on fully-connected neural networks. In order to capture the local information of speech signals, we propose to use convolutional maxout neural networks (CMNNs) to separate speech and noise by estimating the ideal ratio mask of the...

chapter

Stacked bottleneck features for speaker verification

Yao Tian, Liang He, Jia Liu

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 514 - 518

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

i-Vector modeling has been shown to be effective for text-independent speaker verification. It represents each utterance as a low-dimensional vector using factor analysis with a GMM supervector. In order to capture more complex speaker statistics, this paper proposes a new feature representation other than i-vectors for speaker verification using neural networks. In this work, stacked bottleneck features...

chapter

A fast handwritten numeral recognition framework based on peak densities

He Zhang, Jia Liu, Zhengyan Liu, Nan Zhang, more

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) > 549 - 553

2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

In this paper, we present a novel framework for handwritten numeral recognition. Considering unconstrained handwritten numerals as numeral feature vectors in the corresponding numeral vector space, we commence by reducing the coordinate dimensionality of the vector space by employing Spectral Regression Discriminant Analysis (SRDA). We then calculate the local density for all numeral classes. For each...

chapter

The THUEE system for the openKWS14 keyword search evaluation

Meng Cai, Zhiqiang Lv, Beili Song, Yongzhe Shi, more

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4734 - 4738

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The OpenKWS14 keyword search evaluation is one of the most challenging and influential evaluations in the field of speech recognition. Its goal is to build a high-performance keyword search system for a minority language with limited training data in a short period of time. We present the system of the Department of Electronic Engineering, Tsinghua University (THUEE team) for the OpenKWS14 keyword...

chapter

A Bayesian network approach for human reliability analysis of power system

Junxi Tang, Yingkai Bao, Licheng Wang, Haibo Lu, more

2013 IEEE PES Asia-Pacific Power and Energy Engineering Conference (APPEEC) > 1 - 6

2013 IEEE PES Asia-Pacific Power and Energy Engineering Conference (APPEEC)

Along with the improvement of equipment reliability, human error has become a great threat to power system reliability and safety. However, research on human reliability analysis in power systems is still in its infancy. There are still few approaches for quantitatively measuring the human reliability of power systems. In this paper, the definition of human reliability of power system and a...

chapter

THUEE system for the Albayzin 2012 language recognition evaluation

Weiwei Liu, Wei-Qiang Zhang, Liang He, Jiaming Xu, more

2013 IEEE China Summit and International Conference on Signal and Information Processing > 109 - 112

2013 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

The Albayzin 2012 language recognition evaluation (LRE) is one of the most challenging language recognition evaluations, mainly because: (1) the target languages are more confusable with other languages, which might push down system performance; (2) the development and test data are heterogeneous regarding duration, number of speakers, ambient noise/music, channel conditions, etc.; (3) signals...

chapter

Improve low-resource non-native mispronunciation detection with native speech by articulatory-based tandem feature

Hua Yuan, Ji Xu, Junhong Zhao, Jia Liu

2013 IEEE China Summit and International Conference on Signal and Information Processing > 127 - 131

2013 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

In this paper, we propose a method to improve detection of the mispronunciation types of non-native learners. In order to cope with the low-resource condition of non-native speech and the difference between native and non-native speech, the following efforts are made: 1) train the acoustic model with the low-resource non-native data; 2) introduce the articulatory-based tandem feature; 3) pool auxiliary native...

chapter

Improving deep neural network acoustic models using unlabeled data

Meng Cai, Wei-Qiang Zhang, Jia Liu

2013 IEEE China Summit and International Conference on Signal and Information Processing > 137 - 141

2013 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP)

The Context-Dependent Deep-Neural-Network HMM, or CD-DNN-HMM, is a powerful acoustic modeling technique. Its training process typically involves unsupervised pre-training and supervised fine-tuning. In this paper, we demonstrate that the performance of DNNs can be improved by utilizing a large amount of unlabeled data in the training procedure. In our method, CD-DNN-HMM trained using 309 hours of unlabeled...

chapter

Automatic pitch accent detection using auto-context with acoustic features

Junhong Zhao, Wei-Qiang Zhang, Hua Yuan, Jia Liu, more

2012 8th International Symposium on Chinese Spoken Language Processing > 247 - 251

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

In the field of prosody event detection, many local acoustic features have been proposed for representing the prosodic characteristics of speech units. The context information that represents possible regularities underlying neighboring prosody events, however, hasn't been used effectively. The main difficulty in utilizing prosodic context is that it's hard to capture the long-distance sequential dependency...

chapter

The Research of Alphabet Identification Based on Genetic BP Neural Network

Lina Liu, Huijuan Qi, Jia Liu

2012 4th International Conference on Intelligent Human-Machine Systems and Cybernetics > 1 > 25 - 28

2012 4th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC)

A Back Propagation (BP) neural network combined with a genetic algorithm was used to identify letters of the alphabet; the new algorithm combines the advantages of both the genetic algorithm and the BP neural network. The genetic learning algorithm was used for global optimization, and the BP training algorithm was used to fine-tune the neural network weights and train the network to recognize letters. Add-noise...

chapter

Speaker classification based on high dimension feature vector

Yi Yang, Hui Song, Jia Liu

2011 Seventh International Conference on Natural Computation > 2 > 891 - 894

2011 Seventh International Conference on Natural Computation (ICNC)

Audio indexing has been an important part of the NIST-RT-SD evaluation since 2003. Speaker diarization is a kind of audio indexing technology that labels audio by speaker. One essential component of speaker diarization is speaker clustering, which is commonly a pre-processing step for speech recognition. The general method is to extract acoustic features such as LPCC or MFCC and build a model such as an HMM...

Keywords:
TRAINING
Publication type:
book

INFONA - science communication portal
