Kazumasa Yamamoto

chapter

Lyric recognition in monophonic singing using pitch-dependent DNN

Dairoku Kawai, Kazumasa Yamamoto, Seiichi Nakagawa

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 326-330

One of the difficulties in sung speech recognition is the small distance in acoustic space between phonemes in sung speech. We therefore considered clustering the speech by pitch (fundamental frequency, F0) to increase the distance between phonemes. In addition, we considered a two-stage training method for the DNN-HMM: the first stage is trained by using conventional acoustic features...

chapter

A deep neural network integrated with filterbank learning for speech recognition

Hiroshi Seki, Kazumasa Yamamoto, Seiichi Nakagawa

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5480-5484

Deep neural networks (DNNs) have achieved significant success in the field of speech recognition. One of the main advantages of the DNN is automatic feature extraction without human intervention. Therefore, we incorporate a pseudo-filterbank layer into the bottom of the DNN and train the whole filterbank layer and the following networks jointly, while most systems take pre-defined mel-scale filterbanks as...

chapter

Soft-clustering technique for training data in age- and gender-independent speech recognition

Daisuke Enami, Faqiang Zhu, Kazumasa Yamamoto, Seiichi Nakagawa

Proceedings of the 2012 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1-4

In this paper, we propose approaches for Gaussian mixture model (GMM)-based soft clustering of training data and GMM- and/or hidden Markov model (HMM)-based cluster selection in age- and gender-independent speech recognition. Typically, increasing the number of speaker classes leads to more specific models in speaker-class-dependent speech recognition, and thus better recognition performance...
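
As a rough illustration of what GMM-based soft clustering can look like, the sketch below computes the posterior probability of each Gaussian component for one data point and assigns the point to every cluster whose posterior reaches a threshold, so a point near a class boundary can contribute to more than one cluster. This is not the authors' procedure: the function names, the diagonal-covariance assumption, the fixed (already trained) GMM parameters, and the threshold value are all assumptions made for the example.

```python
import numpy as np

def gmm_posteriors(x, means, variances, weights):
    """Posterior p(component k | x) for diagonal-covariance Gaussians."""
    x = np.asarray(x, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # log N(x | mean_k, diag(var_k)) for every component k
    log_lik = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + (x - means) ** 2 / variances,
        axis=1,
    )
    log_joint = np.log(weights) + log_lik
    log_joint -= log_joint.max()  # stabilise the exponentials
    post = np.exp(log_joint)
    return post / post.sum()

def soft_assign(x, means, variances, weights, threshold=0.2):
    """Return every cluster index whose posterior is at least `threshold`.

    The threshold is a hypothetical choice for this sketch.
    """
    post = gmm_posteriors(x, means, variances, weights)
    return [k for k, p in enumerate(post) if p >= threshold]

# Toy two-component GMM (hypothetical parameters).
MEANS = [[0.0, 0.0], [4.0, 0.0]]
VARIANCES = [[1.0, 1.0], [1.0, 1.0]]
WEIGHTS = [0.5, 0.5]

# A point midway between the two means belongs to both clusters,
# while a point at the first mean belongs only to cluster 0.
boundary_clusters = soft_assign([2.0, 0.0], MEANS, VARIANCES, WEIGHTS)
central_clusters = soft_assign([0.0, 0.0], MEANS, VARIANCES, WEIGHTS)
```

With hard clustering the boundary point would be forced into a single class; the soft assignment lets it serve as training data for both neighbouring class-dependent models.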

chapter

Automatic speech recognition using Hidden Conditional Neural Fields

Yasuhisa Fujii, Kazumasa Yamamoto, Seiichi Nakagawa

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5036-5039

Hidden Conditional Random Fields (HCRF) is a very promising approach to modeling speech. However, because HCRF computes the score of a hypothesis by summing up linearly weighted features, it cannot consider nonlinearity among features, which is crucial for speech recognition. In this paper, we extend HCRF by incorporating the gate function used in neural networks and propose a new model called Hidden...
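
The contrast between a linearly weighted score and a gated score can be sketched in a few lines. The abstract is truncated before the model is defined, so the parameterisation below (a sigmoid gate applied to linear projections of the features, followed by a weighted sum) is an assumption for illustration only, and the names `linear_score`, `gated_score`, `U`, and `v` are hypothetical.

```python
import numpy as np

def linear_score(features, w):
    """Score as a weighted sum of features (the linear case)."""
    return float(np.dot(w, features))

def gated_score(features, U, v):
    """Score as a weighted sum of sigmoid-gated feature projections.

    Assumed form for this sketch: v . sigmoid(U @ features).
    """
    hidden = 1.0 / (1.0 + np.exp(-(U @ features)))
    return float(np.dot(v, hidden))

# Hypothetical toy parameters and feature vectors.
w = np.array([0.5, -1.0])
U = np.array([[1.0, 1.0]])
v = np.array([1.0])
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])

# The linear score is additive: score(a + b) equals score(a) + score(b).
linear_gap = linear_score(a + b, w) - (linear_score(a, w) + linear_score(b, w))

# The gated score is not additive, so it can represent interactions
# between features that a purely linear score cannot.
gated_gap = gated_score(a + b, U, v) - (gated_score(a, U, v) + gated_score(b, U, v))
```

Here `linear_gap` is zero, while `gated_gap` is clearly non-zero, which is the kind of nonlinearity among features that the abstract says a linear score cannot capture.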

INFONA - science communication portal

Search results for: Kazumasa Yamamoto
