Search results for: S. Nakamura

Items from 1 to 6 out of 6 results

chapter

Improving spontaneous English ASR using a joint-sequence pronunciation model

H. Hofmann, S. Sakti, R. Isotani, H. Kawai, et al.

2010 4th International Universal Communication Symposium > 58 - 61

2010 4th International Universal Communication Symposium (IUCS 2010)

The performance of English automatic speech recognition systems decreases on spontaneous speech, mainly because multiple pronunciation variants occur in the utterances. Previous approaches address the multiple-pronunciation problem by modeling pronunciation alterations at the phoneme-to-phoneme level. However, the phonetic transformation effects induced by the pronunciation...

chapter

Weighted finite state transducer based statistical dialog management

C. Hori, K. Ohtake, T. Misu, H. Kashioka, et al.

2009 IEEE Workshop on Automatic Speech Recognition & Understanding > 490 - 495

2009 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU 2009)

We proposed a dialog system using a weighted finite-state transducer (WFST) in which user concept tags and system action tags are the input and output of the transducer, respectively. The WFST-based platform for dialog management enables us to combine various statistical models for dialog management (DM), user input understanding, and system action generation, and then search for the best system action in response...

chapter

An HMM-based Vietnamese speech synthesis system

Thang Tat Vu, Mai Chi Luong, S. Nakamura

2009 Oriental COCOSDA International Conference on Speech Database and Assessments > 116 - 121

2009 Oriental COCOSDA International Conference on Speech Database and Assessments

This paper describes an approach to the realization of a Vietnamese speech synthesis system applying a technique whereby speech is directly synthesized from Hidden Markov models (HMMs). Spectrum, pitch, and phone duration are simultaneously modeled in HMMs and their parameter distributions are clustered independently by using decision tree-based context clustering algorithms. Several contextual factors...

chapter

Modeling characteristics of agglutinative languages with Multi-class language model for ASR system

I. Dawa, Y. Sagisaka, S. Nakamura

2009 Oriental COCOSDA International Conference on Speech Database and Assessments > 104 - 109

2009 Oriental COCOSDA International Conference on Speech Database and Assessments

In this paper, we discuss a new language model that takes into account the characteristics of agglutinative languages. We used Mongolian (a Cyrillic language system used in Mongolia) as an example from which to build the language model. We developed a Multi-class N-gram language model based on similar word clustering that focuses on the variable suffixes of a word in Mongolian. By applying our proposed...

chapter

Construction of Chinese conversational corpora for spontaneous speech recognition and comparative study on the trilingual parallel corpora

Xinhui Hu, R. Isotani, S. Nakamura

2009 Oriental COCOSDA International Conference on Speech Database and Assessments > 56 - 59

2009 Oriental COCOSDA International Conference on Speech Database and Assessments

In this paper, we describe the development of Chinese conversational segmented and POS-tagged corpora currently used in the NICT/ATR speech-to-speech translation system. Over 500 K manually checked utterances provide 3.5 M words of Chinese corpora. As far as we know, they are the largest conversational textual corpora in the travel domain. A set of three parallel corpora is obtained with the corresponding...

chapter

Optimal learning of P-Layer additive F0 models with cross-validation

S. Sakai, T. Kawahara, T. Shimizu, S. Nakamura

2009 IEEE International Conference on Acoustics, Speech and Signal Processing > 4245 - 4248

ICASSP 2009 - 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

In this paper, we present the derivation of the backfitting training algorithms for generic p-layer additive F₀ models for an arbitrary positive integer p. We have presented the special cases of the algorithms with p = 2 and p = 3 that have been successfully applied to the modeling of Japanese and English F₀ contours, whereas the derivation of the algorithm was presented only for the two-layer case...

Filter options

Keywords:
NATURAL LANGUAGE PROCESSING


INFONA - science communication portal
