Seiichi Nakagawa

chapter

Robust lecture speech translation for speech misrecognition and its rescoring effect from multiple candidates

Koya Sahashi, Norioki Goto, Hiroshi Seki, Kazumasa Yamamoto, et al.

2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA) > 1 - 6

We describe a scheme to translate spoken English lectures into Japanese, consisting of a deep neural network based English automatic speech recognition (ASR) system and an English-to-Japanese phrase-based statistical machine translation (SMT) system. We focus on the adverse effect of speech misrecognition on the translation model. To cope with this effect, we utilized...

chapter

Lyric recognition in monophonic singing using pitch-dependent DNN

Dairoku Kawai, Kazumasa Yamamoto, Seiichi Nakagawa

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 326 - 330

One of the difficulties in sung speech recognition is the small distance in acoustic space between phonemes in sung speech. We therefore considered clustering the speech by pitch (fundamental frequency, F0) to create a larger distance between the phonemes. In addition, we considered a two-stage training method for the DNN-HMM: the first stage is trained by using conventional acoustic features...

chapter

A deep neural network integrated with filterbank learning for speech recognition

Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5480 - 5484

Deep neural networks (DNNs) have achieved significant success in the field of speech recognition. One of the main advantages of the DNN is automatic feature extraction without human intervention. Therefore, we add a pseudo-filterbank layer at the bottom of the DNN and train the filterbank layer and the following networks jointly, whereas most systems take pre-defined mel-scale filterbanks as...

chapter

Domain adaptation of a speech translation system for lectures by utilizing frequently appearing parallel phrases in-domain

Norioki Goto, Kazumasa Yamamoto, Seiichi Nakagawa

2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

This paper describes our scheme to translate spoken English lectures into Japanese, consisting of an English automatic speech recognition (ASR) system that utilizes a deep neural network (DNN) and an English-to-Japanese phrase-based statistical machine translation (SMT) system. We focused on domain adaptation of the acoustic and translation models. For domain adaptation of the translation model, frequently...

chapter

Deep neural network based acoustic model using speaker-class information for short time utterance

Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa

2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1222 - 1225

In speech recognition, it is preferable not to assume details of a target user, e.g., specific age and gender. However, speaker independence is one of the factors that degrade ASR performance. In this work, we propose a speaker adaptation method to recognize a short-time utterance. There have been several studies on speaker-independent DNN-HMM in which an i-vector is computed, and the additional...

chapter

Elimination of person names in spoken documents for privacy protection

Ryo Kawaguchi, Masatoshi Tsuchiya, Seiichi Nakagawa

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) > 1 - 4

There is an increasing use of sensor networks capable of sensing multimedia data, including audio data. Unfortunately, public use of these data is not allowed because they contain sensitive private information such as person and location names. Person name extraction (PNE), which is a widely investigated research topic, is an effective technique to resolve this problem. However, there is an important difference...

chapter

Automatic speech recognition using Hidden Conditional Neural Fields

Yasuhisa Fujii, Kazumasa Yamamoto, Seiichi Nakagawa

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 5036 - 5039

Hidden Conditional Random Fields (HCRF) are a very promising approach to modeling speech. However, because HCRF computes the score of a hypothesis by summing up linearly weighted features, it cannot consider non-linearity among features, which is crucial for speech recognition. In this paper, we extend HCRF by incorporating the gate function used in neural networks and propose a new model called Hidden...
