Search results for: Li Rong

Items from 21 to 32 out of 32 results

chapter

Statistical modeling of syllable-level F0 features for HMM-based unit selection speech synthesis

Zhen-Hua Ling, Zhi-Guo Wang, Li-Rong Dai

2010 7th International Symposium on Chinese Spoken Language Processing > 144 - 147

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

In the current hidden Markov model (HMM) based unit selection speech synthesis method, the optimal phone-sized candidate units are selected following the maximum likelihood (ML) criterion of the HMMs trained for various acoustic features. This paper introduces statistical models for syllable-level F0 features into this method. Different from the frame-level F0 parameters used in the current framework,...

chapter

GMM-based voice conversion with explicit modelling on feature transform

Ling-Hui Chen, Zhen-Hua Ling, Wu Guo, Li-Rong Dai

2010 7th International Symposium on Chinese Spoken Language Processing > 364 - 368

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

In this paper, we propose a Gaussian mixture model (GMM) based voice conversion method using explicit feature transform models. A piecewise linear transform with stochastic bias is adopted to represent the relationship between the spectral features of source and target speakers. These explicit transformations are integrated into the training of GMM for the joint probability density of source and target...

chapter

Automatic phrase boundary labeling for Mandarin TTS corpus using context-dependent HMM

Chen-Yu Yang, Zhen-Hua Ling, Heng Lu, Wu Guo, more

2010 7th International Symposium on Chinese Spoken Language Processing > 374 - 377

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

In this paper, an automatic prosodic phrase boundary labeling method for speech synthesis databases is presented. This method can be divided into two stages: a training stage and a labeling stage. In the training stage, a context-dependent HMM, which is commonly adopted in HMM-based parametric speech synthesis, is estimated using the training database with manual prosodic labeling. In the labeling stage, the...

chapter

Investigation of prosodic F0 layers in hierarchical F0 modeling for HMM-based speech synthesis

Ming Lei, Yi-Jian Wu, Zhen-Hua Ling, Li-Rong Dai

IEEE 10th INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS > 613 - 616

2010 10th International Conference on Signal Processing (ICSP 2010)

To address the overall-micro modeling issue of the current prosody model in HMM-based speech synthesis, a hierarchical F0 modeling method has been proposed, in which different kinds of pitch patterns are characterized by different prosodic layers and a minimum generation error (MGE) training framework is used to simultaneously optimize the F0 models of all layers. This paper investigates the importance of...

chapter

Minimum generation error training with weighted Euclidean distance on LSP for HMM-based speech synthesis

Ming Lei, Zhen-Hua Ling, Li-Rong Dai

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4230 - 4233

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper presents a minimum generation error (MGE) training method using weighted Euclidean distance measure on line spectral pairs (LSP) for HMM-based speech synthesis. In this paper, weighted Euclidean distance on LSP is introduced as the measurement of generation error to improve the consistency between the model training criterion and the subjective perception on the distortion of synthetic...

chapter

HMM-based pseudo-clean speech synthesis for splice algorithm

Jun Du, Yu Hu, Li-Rong Dai, Ren-Hua Wang

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4570 - 4573

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

In this paper, we present a novel approach to relax the constraint of stereo data, which is needed in a series of algorithms for noise-robust speech recognition. As a demonstration with the SPLICE algorithm, we generate pseudo-clean features to replace the ideal clean features from one of the stereo channels by using HMM-based speech synthesis. Experimental results on the Aurora2 database show that the...

chapter

Full covariance state duration modeling for HMM-based speech synthesis

Heng Lu, Yi-Jian Wu, K. Tokuda, Li-Rong Dai, more

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4033 - 4036

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper proposes a state duration modeling method using a full covariance matrix for HMM-based speech synthesis. In this method, a full covariance matrix instead of the conventional diagonal covariance matrix is adopted in the multi-dimensional Gaussian distribution to model the state duration of each context-dependent phoneme. At the synthesis stage, the state durations are predicted using the clustered...

chapter

Multi-Layer F0 Modeling for HMM-Based Speech Synthesis

Cheng-Cheng Wang, Zhen-Hua Ling, Bu-Fan Zhang, Li-Rong Dai

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

2008 6th International Symposium on Chinese Spoken Language Processing

This paper proposes a two-layer fundamental frequency (F0) modeling method for HMM-based parametric speech synthesis. The F0 models are trained for each context-dependent phoneme in the conventional HMM-based speech synthesis system. Considering the super-segmental characteristics of F0 features, an explicit syllable-layer F0 model is introduced in this paper. At synthesis stage, the F0 contour is...

chapter

Constructing scalable TTS system based on Corpus approach

Wei Zhang, Zheng-hua Ling, Li-rong Dai

2008 IEEE Conference on Cybernetics and Intelligent Systems > 230 - 235

2008 IEEE Conference on Cybernetics and Intelligent Systems

Pruning redundant synthesis instances, or tailoring the TTS voice font, is an important issue in Corpus-based TTS. However, pruning redundant synthesis instances usually results in loss of non-uniform. In order to solve this problem, this paper proposes the concept of virtual non-uniform. According to this concept and the synthesis frequency of each instance, the algorithm named StaRp-VPA is constructed as...

chapter

Semantic Computing in Scalable Text-to-Speech System

Zhang Wei, Pang Min-hui, Dai Li-rong

IEEE International Workshop on Semantic Computing and Systems > 113 - 118

2008 IEEE International Workshop on Semantic Computing and Systems (WSCS 2008)

Because of the diversity of hardware environments, building a scalable text-to-speech system is an important issue for Corpus-based text-to-speech systems. This paper proposes and analyses three semantic computing problems of building a scalable text-to-speech system: similarity calculation, granular computing and an automated instances-pruning process framework. According to these, an acoustic clustering algorithm-NuClustering-VPA...

chapter

Minimum generation error criterion considering global/local variance for HMM-based speech synthesis

Long Qin, Yi-Jian Wu, Zhen-Hua Ling, Ren-Hua Wang, more

2008 IEEE International Conference on Acoustics, Speech and Signal Processing > 4621 - 4624

ICASSP 2008. IEEE International Conference on Acoustics, Speech and Signal Processing

Due to the inconsistency between the maximum likelihood (ML) based training and the synthesis application in HMM-based speech synthesis, a minimum generation error (MGE) criterion has been proposed for HMM training. This paper continues to apply the MGE criterion to model adaptation for HMM-based speech synthesis. We propose an MGE linear regression (MGELR) based model adaptation algorithm, where the...

chapter

Minimum generation error linear regression based model adaptation for HMM-based speech synthesis

Long Qin, Yi-Jian Wu, Zhen-Hua Ling, Ren-Hua Wang, more

2008 IEEE International Conference on Acoustics, Speech and Signal Processing > 3953 - 3956

ICASSP 2008. IEEE International Conference on Acoustics, Speech and Signal Processing

Keywords:
SPEECH SYNTHESIS

INFONA - science communication portal
