Search results for: Y. Nankaku

Items from 1 to 12 out of 12 results

chapter

Cross-lingual speaker adaptation for HMM-based speech synthesis considering differences between language-dependent average voices

Xianglin Peng, K Oura, Y Nankaku, K Tokuda

IEEE 10th INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS > 605 - 608

2010 10th International Conference on Signal Processing (ICSP 2010)

This paper proposes an improved cross-lingual speaker adaptation technique that considers the differences between language-dependent average voices in a Speech-to-Speech Translation system. A state mapping based method has been introduced for cross-lingual speaker adaptation in HMM-based speech synthesis. In this method, the transforms estimated from the input language are applied to average voice...

chapter

Factor analyzed voice models for HMM-based speech synthesis

K Kazumi, Y Nankaku, K Tokuda

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 4234 - 4237

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

This paper describes factor analyzed voice models for realizing various voice characteristics in the HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of PCA is to accurately reconstruct each speaker-dependent HMM set, and this is not equivalent to estimating models which represent...

chapter

A Bayesian approach to HMM-based speech synthesis

K. Hashimoto, H. Zen, Y. Nankaku, T. Masuko, more

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4029 - 4032

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper proposes a new framework of speech synthesis based on the Bayesian approach. The Bayesian method is a statistical technique for estimating reliable predictive distributions by marginalizing model parameters. In the proposed framework, all processes for constructing the system can be derived from one single predictive distribution which represents the basic problem of speech synthesis directly...

chapter

Voice conversion based on simultaneous modelling of spectrum and F0

K. Yutani, Y. Uto, Y. Nankaku, A. Lee, more

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 3897 - 3900

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper proposes a simultaneous modeling of spectrum and F0 for voice conversion based on MSD (multi-space probability distribution) models. As a conventional technique, a spectral conversion based on GMM (Gaussian mixture model) has been proposed. Although this technique converts spectral feature sequences nonlinearly based on GMM, F0 sequences are usually converted by a simple linear function...

chapter

Stereo-based stochastic noise compensation based on trajectory GMMs

H. Zen, Y. Nankaku, K. Tokuda

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4577 - 4580

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper proposes a novel stereo-based stochastic noise compensation technique based on trajectory GMMs. Although GMM-based noise compensation techniques such as SPLICE work effectively, their performance sometimes degrades due to the inappropriate dynamic characteristics caused by the frame-by-frame mapping. While the use of dynamic feature constraints on the mapping stage can alleviate this...

chapter

Simultaneous Acoustic, Prosodic, and Phrasing Model Training for TTS Conversion Systems

K. Oura, Y. Nankaku, T. Toda, K. Tokuda, more

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

2008 6th International Symposium on Chinese Spoken Language Processing

A new integrated model for simultaneous modeling of linguistic and acoustic models, and a training algorithm, are proposed. Usually, text-to-speech (TTS) systems based on the hidden Markov model (HMM) consist of text analysis and speech synthesis modules. Linguistic and acoustic model training are performed independently using different training data sets. Integrated model parameters were simultaneously...

chapter

Analysis of stream-dependent tying structure for HMM-based speech synthesis

Zhi-Peng Yu, Yi-Jian Wu, H. Zen, Y. Nankaku, more

2008 9th International Conference on Signal Processing > 655 - 658

2008 9th International Conference on Signal Processing (ICSP 2008)

In the conventional HMM-based speech synthesis framework, spectral features are modeled in one stream, and stream-dependent tree-based clustering is then applied for tying the model parameters. In this paper, we investigate several different stream-dependent tying structures for spectral features by splitting the feature vector into several streams. One splitting approach is to split each feature dimension...

chapter

Acoustic modeling with contextual additive structure for HMM-based speech recognition

Y. Nankaku, K. Nakamura, H. Zen, K. Tokuda

2008 IEEE International Conference on Acoustics, Speech and Signal Processing > 4469 - 4472

ICASSP 2008 - 2008 IEEE International Conference on Acoustics, Speech and Signal Processing

This paper proposes an acoustic modeling technique based on an additive structure of context dependencies for HMM-based speech recognition. Typical context dependent models, e.g., triphone HMMs, have direct dependencies of phonetic contexts, i.e., if a phonetic context is given, the Gaussian distribution is specified immediately. This paper assumes a more complex structure, an additive structure of...

chapter

Face Recognition Based on Separable Lattice HMMs

D. Kurata, Y. Nankaku, K. Tokuda, T. Kitamura, more

2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings > 5 > V

2006 IEEE International Conference on Acoustics, Speech, and Signal Processing

In this paper, we propose separable lattice hidden Markov models, in which multiple hidden state sequences interact to model the observation on a lattice. The proposed model can be efficiently applied for modeling images, image sequences, 3-D object models and higher dimensional applications, due to the composite structure of Markov chains which reduces the complexity while retaining good properties...

chapter

Hidden Semi-Markov Model Based Speech Recognition System using Weighted Finite-State Transducer

K. Oura, H. Zen, Y. Nankaku, A. Lee, more

2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings > 1 > I

2006 IEEE International Conference on Acoustics, Speech, and Signal Processing

In hidden Markov models (HMMs), state duration probabilities decrease exponentially with time. This is an inappropriate representation of the temporal structure of speech. One of the solutions for this problem is integrating state duration probability distributions explicitly into the HMM. This form is known as a hidden semi-Markov model (HSMM). Although a number of attempts to use explicit duration...

chapter

Estimating Trajectory HMM Parameters Using Monte Carlo EM with Gibbs Sampler

H. Zen, Y. Nankaku, K. Tokuda, T. Kitamura

2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings > 1 > I

2006 IEEE International Conference on Acoustics, Speech, and Signal Processing

In the present paper, the Monte Carlo EM (MCEM) algorithm with a Gibbs sampler is applied for estimating parameters of a trajectory HMM, which has been derived from an HMM by imposing explicit relationships between static and dynamic features. The trajectory HMM can alleviate two limitations of the HMM, which are i) constant statistics within a state, and ii) conditional independence of state output...

chapter

On the Use of Phonetic Information for Mapping from Articulatory Movements to Vocal Tract Spectrum

K. Nakamura, T. Toda, Y. Nankaku, K. Tokuda

2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings > 1 > I

2006 IEEE International Conference on Acoustics, Speech, and Signal Processing

This paper describes a method for determining the vocal tract spectrum from articulatory movements using hidden Markov models (HMMs). In the proposed system, articulatory parameters are generated from a TTS system and converted to acoustic features to be synthesized. Compared with conventional GMM-based systems, the proposed system has two additional properties: 1) phonetic information given input...



INFONA - science communication portal
