Chiori Hori

chapter

Speaker adaptive training for deep neural networks embedding linear transformation networks

Tsubasa Ochiai, Shigeki Matsuda, Hideyuki Watanabe, Xugang Lu, et al.

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4605 - 4609

ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Recently, a novel speaker adaptation method was proposed that applied the Speaker Adaptive Training (SAT) concept to a speech recognizer consisting of a Deep Neural Network (DNN) and a Hidden Markov Model (HMM), and its utility was demonstrated. This method implements the SAT scheme by allocating one Speaker Dependent (SD) module for each training speaker to one of the intermediate layers of the front-end...

chapter

Tuning intonation with pitch accent decomposition for HMM-based expressive speech synthesis

Jinfu Ni, Yoshinori Shiga, Chiori Hori

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific > 1 - 10

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

Expressive intonation makes focal prominence to give emphases that highlight the focus of speech. This paper describes a method for improving the expressiveness of HMM-based voices, particularly putting focal prominence on a word. Different from previous methods, our method exploits a speech corpus available for model training, without needing to record additional emphasis speech. This method employs...

chapter

Non-monologue HMM-based speech synthesis for service robots: A cloud robotics approach

Komei Sugiura, Yoshinori Shiga, Hisashi Kawai, Teruhisa Misu, et al.

2014 IEEE International Conference on Robotics and Automation (ICRA) > 2237 - 2242

2014 IEEE International Conference on Robotics and Automation (ICRA)

Robot utterances generally sound monotonous, unnatural, and unfriendly because their Text-to-Speech (TTS) systems are not optimized for communication but for text-reading. Here we present a non-monologue speech synthesis for robots. We collected a speech corpus in a non-monologue style in which two professional voice talents read scripted dialogues. Hidden Markov models (HMMs) were then trained with...

chapter

Multilingual Speech-to-Speech Translation System: VoiceTra

Shigeki Matsuda, Xinhui Hu, Yoshinori Shiga, Hideki Kashioka, et al.

2013 IEEE 14th International Conference on Mobile Data Management > 2 > 229 - 233

2013 14th IEEE International Conference on Mobile Data Management (MDM)

This study presents an overview of VoiceTra, which was developed by NICT and released as the world's first network-based multilingual speech-to-speech translation system for smartphones, and describes in detail its multilingual speech recognition, its multilingual translation, and its multilingual speech synthesis in regards to field experiments. We show the effects of system updates using the data...

chapter

WFST-Based Spoken Dialogue System on Smartphones -- Its Development and Implementation for Field Use

Etsuo Mizukami, Teruhisa Misu, Chiori Hori

2013 IEEE 14th International Conference on Mobile Data Management > 2 > 217 - 224

2013 14th IEEE International Conference on Mobile Data Management (MDM)

We proposed the WFSTDM which is an expandable and adaptable dialogue management platform. The WFSTDM combines various WFSTs and enables us to develop new dialogue management WFSTs necessary for rapid prototyping of spoken dialogue systems. In this paper, we illustrate the outline of the WFSTDM and introduce the WFSTDM builder, a network-based spoken dialogue system development tool. In addition, we...

article

Controlling Tradeoff Between Approximation Accuracy and Complexity of a Smooth Function in a Reproducing Kernel Hilbert Space for Noise Reduction

Xugang Lu, Masashi Unoki, Shigeki Matsuda, Chiori Hori, et al.

IEEE Transactions on Signal Processing > 2013 > 61 > 3 > 601 - 610

Noise reduction algorithms are widely used to mitigate noise effects on speech to improve the robustness of speech technology applications. However, they inevitably cause speech distortion. The tradeoff between noise reduction and speech distortion is a key concern in designing noise reduction algorithms. This study proposes a novel framework for noise reduction by considering this tradeoff. We regard...

chapter

Controlling the tradeoff property in a regularization framework for noise reduction

Xugang Lu, Masashi Unoki, Shigeki Matsuda, Chiori Hori, et al.

2012 8th International Symposium on Chinese Spoken Language Processing > 201 - 205

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

The tradeoff between noise reduction and speech distortion is a key concern in designing noise reduction algorithms. We have proposed a regularization framework for noise reduction with the consideration of the tradeoff problem. We regard speech estimation as a functional approximation problem in a reproducing kernel Hilbert space (RKHS). In the estimation, the objective function is formulated to...

chapter

Acoustic space partition based on broad phonetic class for ensemble acoustic modeling

Xugang Lu, Yu Tsao, Shigeki Matsuda, Chiori Hori, et al.

2012 8th International Symposium on Chinese Spoken Language Processing > 311 - 314

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

Ensemble acoustic modeling can be used to model different factors that cause variability of acoustic space, and provide different combination to improve the performance of automatic speech recognition (ASR). One of the main concerns is how to partition the training data set to several subsets based on which ensemble models are trained. In this study, we focus on ensemble acoustic modeling concerned...

chapter

Collecting sentences from web resources for constructing spontaneous Chinese language model

Xinhui Hu, Youzheng Wu, Shigeki Matsuda, Chiori Hori, et al.

2012 8th International Symposium on Chinese Spoken Language Processing > 197 - 200

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

In this paper, we present our work on collecting spontaneous texts from the Web for constructing a language model in a Chinese speech recognition system. The selection of spontaneous-like texts involves two steps: First, word-segmented web texts are selected using a perplexity-based approach in which the style-related words are strengthened by omitting infrequent topic words from similarity measurements...

chapter

A linear projection approach to environment modeling for robust speech recognition

Yu Tsao, Chien-Lin Huang, Shigeki Matsuda, Chiori Hori, et al.

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4329 - 4332

ICASSP 2012 - 2012 IEEE International Conference on Acoustics, Speech and Signal Processing

Use of a linear projection (LP) function to transform multiple sets of acoustic models into a single set of acoustic models is proposed for characterizing testing environments for robust automatic speech recognition. The LP function is an extension of the linear regression (LR) function used in maximum likelihood linear regression (MLLR) and maximum a posteriori linear regression (MAPLR) by incorporating...

chapter

Feature normalization and selection for robust speaker state recognition

Chien-Lin Huang, Yu Tsao, Chiori Hori, Hideki Kashioka

2011 International Conference on Speech Database and Assessments (Oriental COCOSDA) > 102 - 105

2011 Oriental COCOSDA 2011 - International Conference on Speech Database and Assessments

In this paper, we propose an integration process of feature compensation and selection on the collective acoustic feature sets to derive a set of advanced acoustic features for speaker state recognition. For feature normalization, we perform a two-dimensional histogram equalization (2-D HEQ) normalization to reduce variability of speaker and speaking environment factors. For feature selection, we...

chapter

Automatic speech summarization applied to English broadcast news speech

Chiori Hori, Sadaoki Furui, Rob Malkin, Hua Yu, et al.

2002 IEEE International Conference on Acoustics, Speech, and Signal Processing > 1 > I-9 - I-12

Proceedings of ICASSP '02

This paper reports an automatic speech summarization method and experimental results using English broadcast news speech. In our proposed method, a set of words maximizing a summarization score indicating an appropriateness of summarization is extracted from automatically transcribed speech. This extraction is performed using a Dynamic Programming (DP) technique according to a target compression ratio...

INFONA - science communication portal

Search results for: Chiori Hori
