Search results for: Jian Wu

Items from 1 to 9 out of 9 results

chapter

Cross validation and Minimum Generation Error for improved model clustering in HMM-based TTS

Feng-Long Xie, Yi-Jian Wu, Frank K. Soong

2012 8th International Symposium on Chinese Spoken Language Processing > 60 - 63

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

In HMM-based speech synthesis, the context-dependent hidden Markov model (HMM) is widely used for its capability to synthesize highly intelligible and fairly smooth speech. However, training HMMs for all possible contexts well is difficult, or even impossible, because training data coverage is intrinsically insufficient. As a result, models trained this way may overfit, and their capability in predicting...

chapter

Synthesizing visual speech trajectory with minimum generation error

Lijuan Wang, Yi-Jian Wu, Xiaodan Zhuang, Frank K. Soong

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4580 - 4583

ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

In this paper, we propose a minimum generation error (MGE) training method to refine the audio-visual HMM to improve visual speech trajectory synthesis. Compared with the traditional maximum likelihood (ML) estimation, the proposed MGE training explicitly optimizes the quality of the generated visual speech trajectory, where the audio-visual HMM modeling is jointly refined by using a heuristic method...

chapter

Minimum generation error training by using original spectrum as reference for log spectral distortion measure

Yi-Jian Wu, K. Tokuda

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4013 - 4016

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper improves a minimum generation error (MGE) based HMM training technique for HMM-based speech synthesis by directly using the original spectrum instead of line spectral pairs (LSPs) as the reference spectrum for the log spectral distortion (LSD) measure. Two types of original reference spectra for LSD calculation are investigated, including the spectrum extracted from the speech waveform by STRAIGHT,...

chapter

Full covariance state duration modeling for HMM-based speech synthesis

Heng Lu, Yi-Jian Wu, K. Tokuda, Li-Rong Dai, more

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4033 - 4036

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper proposes a state duration modeling method using a full covariance matrix for HMM-based speech synthesis. In this method, a full covariance matrix instead of the conventional diagonal covariance matrix is adopted in the multi-dimensional Gaussian distribution to model the state duration of each context-dependent phoneme. At the synthesis stage, the state durations are predicted using the clustered...

chapter

Cross-lingual speech recognition under runtime resource constraints

Dong Yu, Li Deng, Peng Liu, Jian Wu, more

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4193 - 4196

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper proposes and compares four cross-lingual and bilingual automatic speech recognition techniques under the constraint that only the acoustic model (AM) of the native language is used at runtime. The first three techniques fall into the category of lexicon conversion where each phoneme sequence (PHS) in the foreign language (FL) lexicon is mapped into the native language (NL) phoneme sequence...

chapter

Cross-Lingual Speaker Adaptation for HMM-Based Speech Synthesis

Yi-Jian Wu, S. King, K. Tokuda

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

2008 6th International Symposium on Chinese Spoken Language Processing

This paper explores a cross-lingual speaker adaptation technique for HMM-based speech synthesis, where a source voice model for English is transformed into a target speaker model using Mandarin Chinese speech data from the target speaker. A phone mapping-based method is adopted to map Chinese Initial/Finals into English phonemes and two types of mapping rules, including one-to-one and one-to-sequence...

chapter

Model Adaptation for HMM-Based Speech Synthesis under Minimum Generation Error Criterion

Long Qin, Yi-Jian Wu, Zhen-Hua Ling, Ren-Hua Wang

2008 Tenth IEEE International Symposium on Multimedia > 539 - 544

2008 Tenth IEEE International Symposium on Multimedia

To solve the issues related to maximum likelihood (ML) based HMM training for HMM-based speech synthesis, a minimum generation error (MGE) criterion has been proposed. This paper continues to apply the MGE criterion to model adaptation for HMM-based speech synthesis. We introduce an MGE linear regression (MGELR) based model adaptation algorithm, where the transforms from source HMMs to...

chapter

Improvements on Mel-Frequency Cepstrum Minimum-Mean-Square-Error Noise Suppressor for Robust Speech Recognition

Dong Yu, Li Deng, Jian Wu, Yifan Gong, more

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

2008 6th International Symposium on Chinese Spoken Language Processing

Recently we have developed a non-linear feature-domain noise reduction algorithm based on the minimum mean square error (MMSE) criterion on Mel-frequency cepstra (MFCC) for environment-robust speech recognition. Our novel algorithm operates on the power spectral magnitude of the filter-bank's outputs and outperforms the log-MMSE spectral amplitude noise suppressor proposed by Ephraim and Malah in...

chapter

Analysis of stream-dependent tying structure for HMM-based speech synthesis

Zhi-Peng Yu, Yi-Jian Wu, H. Zen, Y. Nankaku, more

2008 9th International Conference on Signal Processing > 655 - 658

2008 9th International Conference on Signal Processing (ICSP 2008)

In the conventional HMM-based speech synthesis framework, spectral features are modeled in one stream, and stream-dependent tree-based clustering is then applied for tying the model parameters. In this paper, we investigate several different stream-dependent tying structures for spectral features by splitting the feature vector into several streams. One splitting approach is to split each feature dimension...


INFONA - science communication portal
