Search results for: S. Nakamura

Items from 1 to 12 out of 12 results

chapter

Development and application of multilingual speech translation

S. Nakamura

2009 Oriental COCOSDA International Conference on Speech Database and Assessments > 9 - 12

This paper describes the latest version of the handheld speech-to-speech translation system developed by the National Institute of Information and Communications Technology (NICT). Because all of the speech-to-speech translation functions are implemented in a single terminal, it provides a real-time, location-free speech-to-speech translation service for many language pairs. A new noise-suppression technique notably...

chapter

Toward translating Indonesian spoken utterances to/from other languages

S. Sakti, M. Paul, R. Maia, S. Sakai, more

2009 Oriental COCOSDA International Conference on Speech Database and Assessments > 137 - 142

This paper outlines the National Institute of Information and Communications Technology / Advanced Telecommunications Research Institute International (NICT/ATR) research activities in developing a spoken language translation system, specifically for translating Indonesian spoken utterances into/from Japanese or English. Since the NICT/ATR Japanese-English speech translation system is an established...

chapter

An HMM-based Vietnamese speech synthesis system

Thang Tat Vu, Mai Chi Luong, S. Nakamura

2009 Oriental COCOSDA International Conference on Speech Database and Assessments > 116 - 121

This paper describes an approach to the realization of a Vietnamese speech synthesis system applying a technique whereby speech is directly synthesized from Hidden Markov models (HMMs). Spectrum, pitch, and phone duration are simultaneously modeled in HMMs and their parameter distributions are clustered independently by using decision tree-based context clustering algorithms. Several contextual factors...

chapter

Speech timing and cross-linguistic studies towards computational human modeling

Y. Sagisaka, H. Kato, M. Tsuzaki, S. Nakamura, more

2009 Oriental COCOSDA International Conference on Speech Database and Assessments > 1 - 8

In this paper, we introduce Japanese segmental duration characteristics and computational modeling that we have been studying for around three decades in speech synthesis. A series of experimental results are also shown on loudness dependence in the duration perception. These computational duration modeling and perceptual studies on duration error sensitivity to loudness give some insights for computational...

chapter

Optimal learning of P-Layer additive F0 models with cross-validation

S. Sakai, T. Kawahara, T. Shimizu, S. Nakamura

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4245 - 4248

In this paper, we present the derivation of the backfitting training algorithms for generic p-layer additive F₀ models for arbitrary positive integer p. We have presented the special cases of the algorithms with p = 2 and p = 3 that have been successfully applied to the modelings of Japanese and English F₀ contours, whereas the derivation of the algorithm was presented only for the two-layer case...

chapter

CART-based modeling of Chinese tonal patterns with a functional model tracing the fundamental frequency trajectories

Jinfu Ni, S. Sakai, T. Shimizu, S. Nakamura

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4253 - 4256

We propose an approach to modeling Chinese tonal patterns, focusing on the basic fundamental frequency (F₀) patterns characterized by the contextual linguistic features that can be directly extracted from text. We analyze tonal patterns as sparse target points (tonal F₀ peaks and valleys) and represent them in parametric form within the framework of a functional F₀ model. The relationships between...

chapter

Prosody Modeling from Tone to Intonation in Chinese using a Functional F0 Model

J. Ni, S. Sakai, T. Shimizu, S. Nakamura

2008 Second International Symposium on Universal Communication > 397 - 404

Chinese is a tonal language. It has both lexical tones and intonation. The fundamental frequency (F₀) contours thereby consist of tone and intonation components. This paper presents an approach to modeling the two components in separate ways and combining them to form the final F₀ contours based on a functional F₀ model. We analyze tonal patterns as sparse target points (tonal F₀ peaks and valleys)...

chapter

Simultaneous Acoustic, Prosodic, and Phrasing Model Training for TTS Conversion Systems

K. Oura, Y. Nankaku, T. Toda, K. Tokuda, more

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

A new integrated model for simultaneous modeling of linguistic and acoustic models, together with a training algorithm, is proposed. Usually, text-to-speech (TTS) systems based on the hidden Markov model (HMM) consist of text analysis and speech synthesis modules. Linguistic and acoustic model training are performed independently using different training data sets. Integrated model parameters were simultaneously...

chapter

Frequency Modulation Technique for Prosodic Modification

Jinfu Ni, S. Sakai, T. Shimizu, S. Nakamura

2008 6th International Symposium on Chinese Spoken Language Processing > 1 - 4

Modulation of speaking tone in frequency can make speech interesting and convey subtle meaning in communication. We present a frequency modulation (FM) technique for prosodic modification to consider communicative speech synthesis. This technique provides a mathematical formulation for representing speaking tone and manipulating FM in a unified framework. Two experiments are conducted with a text-to-speech...

chapter

On the state definition for a trainable excitation model in HMM-based speech synthesis

R. Maia, T. Toda, K. Tokuda, S. Sakai, more

2008 IEEE International Conference on Acoustics, Speech and Signal Processing > 3965 - 3968

One of the issues of speech synthesizers based on hidden Markov models concerns the vocoded quality of the synthesized speech. From the principle of analysis-by-synthesis speech coders a trainable excitation model has been proposed to improve naturalness, where the method consists in the design of a set of state-dependent filters in a way to minimize the distortion between residual and synthetic excitation...

chapter

Admissible stopping in Viterbi beam search for unit selection in concatenative speech synthesis

S. Sakai, T. Kawahara, S. Nakamura

2008 IEEE International Conference on Acoustics, Speech and Signal Processing > 4613 - 4616

Corpus-based concatenative speech synthesis is very popular these days due to its highly natural speech quality. The amount of computation required in the run time, however, is often quite large and various approaches have been proposed for reducing this runtime computation. In this paper, we propose early stopping schemes for Viterbi beam search in the unit selection, with which we can stop early...

chapter

Use of Poisson Processes to Generate Fundamental Frequency Contours

Jinfu Ni, S. Nakamura

2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07 > 4 > IV-825 - IV-828

The prosodic contributions to voice fundamental frequency (F₀) contours can be analyzed into a series of sparser tonal targets (F₀ peaks and valleys). The transitions through these targets are interpolated by spline or filtering functions to predict the shape of F₀ contours. A functional model was proposed in the previous work for this purpose. This paper presents an enhanced version of this model...

Keyword filter: SPEECH SYNTHESIS
