Search results for: Yong Zhao

Items from 1 to 10 out of 10 results

chapter

3D emotional facial animation synthesis with factored conditional Restricted Boltzmann Machines

Yong Zhao, Dongmei Jiang, Hichem Sahli

2015 International Conference on Affective Computing and Intelligent Interaction (ACII) > 797 - 803

2015 International Conference on Affective Computing and Intelligent Interaction (ACII)

This paper presents a 3D emotional facial animation synthesis approach based on the Factored Conditional Restricted Boltzmann Machines (FCRBM). Facial Action Parameters (FAPs) extracted from 2D face image sequences are adopted to train the FCRBM model parameters. Based on the trained model, given an emotion label sequence and several initial frames of FAPs, the corresponding FAP sequence is generated...

chapter

Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data

Yong Zhao, Jinyu Li, Jian Xue, Yifan Gong

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4310 - 4314

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

To develop speaker adaptation algorithms for deep neural networks (DNNs) that are suitable for large-scale online deployment, it is desirable that the adaptation model be represented in a compact form and learned in an unsupervised fashion. In this paper, we propose a novel low-footprint adaptation technique for DNNs that adapts the DNN model through node activation functions. The approach introduces...

chapter

Stranded Gaussian mixture hidden Markov models for robust speech recognition

Yong Zhao, Biing-Hwang Juang

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4301 - 4304

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

Gaussian mixture (GMM)-HMMs, though being the predominant modeling technique for speech recognition, are often criticized as being inaccurate to model heterogeneous data sources. In this work, we propose the stranded Gaussian mixture (SGMM)-HMM, an extension of the GMM-HMM, to explicitly model the dependence among the mixture components, i.e., each mixture component is assumed to depend on the previous...

chapter

A general discriminative training algorithm for speech recognition using weighted finite-state transducers

Yong Zhao, Andrej Ljolje, Diamantino Caseiro, Biing-Hwang Juang

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4217 - 4220

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper, we present a general algorithmic framework based on WFSTs for implementing a variety of discriminative training methods, such as MMI, MCE, and MPE/MWE. In contrast to the ordinary word lattices, the transducer-based lattices are more amenable to representing and manipulating the underlying hypothesis space and have a finer granularity at the HMM-state level. The transducers are processed...

chapter

Dimensional emotion driven facial expression synthesis based on the multi-stream DBN model

Hao Wu, Dongmei Jiang, Yong Zhao, Hichem Sahli

Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference > 1 - 6

2012 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

This paper proposes a dynamic Bayesian network (DBN) based MPEG-4 compliant 3D facial animation synthesis method driven by the (Evaluation, Activation) values in the continuous emotion space. For each emotion, a state synchronous DBN model (SS_DBN) is first trained using the Cohn-Kanade (CK) database with two streams of inputs: (i) the annotated (Evaluation, Activation) values, and (ii) the extracted...

chapter

Non-linear noise compensation for robust speech recognition using Gauss-Newton method

Yong Zhao, Biing-Hwang Juang

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4796 - 4799

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this paper, we present the Gauss-Newton method as a unified approach to optimizing non-linear noise compensation models, such as vector Taylor series (VTS), data-driven parallel model combination (DPMC), and unscented transform (UT). We demonstrate that the commonly used approaches that iteratively approximate the noise parameters in an EM framework are variants of the Gauss-Newton method. Through...

chapter

On noise estimation for robust speech recognition using vector Taylor series

Yong Zhao, Biing-Hwang Juang

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4290 - 4293

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

In this paper, we propose a novel noise variance estimation method using the fixed point method for the VTS-based robust speech recognition. Noise parameters are re-estimated over a given utterance using an EM algorithm. The derivative of the auxiliary function with respect to the noise variance is resolved, and the fixed point algorithm estimates the noise variance by recursively approximating the...

chapter

A study on recognizing distorted speech over local distributed transducer networks

Yong Zhao, Sunghwan Shin, E. Robledo-Arnuncio, Biing-Hwang Juang

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4181 - 4184

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

In a collaborative scenario, a multiplicity of portable devices may constitute a network of distributed microphones, without a clearly defined geometric configuration or synchronization that can be taken advantage of for traditional microphone array processing to enhance the acquired signal. This application scenario represents a severe, but interesting challenge for automatic speech recognition systems...

chapter

Measuring Target Cost in Unit Selection with KL-Divergence Between Context-Dependent HMMs

Yong Zhao, Peng Liu, Yusheng Li, Yining Chen, et al.

2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings > 1 > I

2006 IEEE International Conference on Acoustics, Speech, and Signal Processing

This paper proposes a new approach for measuring the target cost in unit selection, where the difference between the target and candidate units is estimated by the Kullback-Leibler divergence (KLD) between the context-dependent hidden Markov models (HMM). In order to model the left/right phonetic context, biphone models are generated by merging regular tri-phone HMMs sharing the same left/right phonetic...

chapter

A novel method of recognizing ageing face based on EHMM

Ye Sun, Jian-Ming Zhang, Liang-Min Wang, Yong-Zhao Zhan, et al.

2005 International Conference on Machine Learning and Cybernetics > 8 > 4599 - 4604 Vol. 8

Proceedings of 2005 International Conference on Machine Learning and Cybernetics

The existing automatic methods of face recognition cannot recognize ageing faces with great changes in facial appearance. In this paper, a novel algorithm based on EHMM (embedded hidden Markov model) is presented to recognize the face with large ageing effects. Firstly, the non-linear relations between age and motions of key feature points in face are achieved by analyzing a great number of samples,...

