Determination of pitch in noise is challenging because of corrupted harmonic structure. In this paper, we extract pitch using supervised learning, where probabilistic pitch states are directly learned from noisy speech. We investigate two alternative neural networks modeling the pitch states given observations. The first one is the feedforward deep neural network (DNN), which is trained on static...
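The abstract frames pitch extraction as classification over discrete pitch states. A minimal sketch of how continuous F0 values might be mapped to such states is given below; the frequency range, bin count, and log-spaced quantization are illustrative assumptions, not details from the paper.

```python
import numpy as np

def pitch_to_state(f0_hz, f_min=60.0, f_max=400.0, n_bins=67):
    """Map an F0 value (Hz) to a discrete pitch state.

    State 0 denotes unvoiced frames; states 1..n_bins quantize the
    log-frequency axis between f_min and f_max (assumed values).
    """
    if f0_hz <= 0:
        return 0  # unvoiced
    f0 = np.clip(f0_hz, f_min, f_max)
    frac = (np.log(f0) - np.log(f_min)) / (np.log(f_max) - np.log(f_min))
    return 1 + min(int(frac * n_bins), n_bins - 1)

# Example: quantize a short F0 track into training targets for a classifier.
targets = [pitch_to_state(f) for f in [0.0, 110.0, 220.0, 0.0]]
```

A network trained on such targets outputs a posterior over pitch states per frame, from which a pitch contour can be decoded.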
In this paper we propose the use of Long Short-Term Memory recurrent neural networks for speech enhancement. Networks are trained to predict clean speech as well as noise features from noisy speech features, and a magnitude domain soft mask is constructed from these features. Extensive tests are run on 73k noisy and reverberated utterances from the Audio-Visual Interest Corpus of spontaneous, emotionally...
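The soft-mask construction described in this abstract can be sketched as a per-bin ratio of estimated speech magnitude to total estimated magnitude. This is a minimal illustration of the masking idea, assuming the network outputs magnitude spectra for speech and noise; it is not the paper's exact formulation.

```python
import numpy as np

def soft_mask(speech_mag, noise_mag, eps=1e-8):
    """Magnitude-domain soft mask: fraction of each time-frequency
    bin attributed to speech, given estimated speech and noise magnitudes."""
    return speech_mag / (speech_mag + noise_mag + eps)

# Toy example: 2 frequency bins x 2 frames of estimated magnitudes.
speech = np.array([[3.0, 1.0], [0.5, 2.0]])
noise = np.array([[1.0, 1.0], [0.5, 0.0]])
mask = soft_mask(speech, noise)
# Enhancement applies the mask to the noisy magnitude spectrogram.
enhanced = mask * (speech + noise)
```

The mask is bounded in [0, 1] by construction, so applying it can only attenuate the noisy spectrogram, never amplify it.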
Recurrent neural networks (RNNs) have recently produced record-setting performance in language modeling and word-labeling tasks. In the word-labeling task, the RNN is used analogously to the more traditional conditional random field (CRF) to assign a label to each word in an input sequence, and has been shown to significantly outperform CRFs. In contrast to CRFs, RNNs operate in an online fashion...
This paper describes our joint efforts to provide robust automatic speech recognition (ASR) for reverberated environments, such as in hands-free human-machine interaction. We investigate blind feature space de-reverberation and deep recurrent de-noising auto-encoders (DAE) in an early fusion scheme. Results on the 2014 REVERB Challenge development set indicate that the DAE front-end provides complementary...
Non-verbal speech cues play an important role in human communication, such as expressing emotional states or maintaining the conversational flow. In this paper we investigate the effect of applying deep bidirectional Long Short-Term Memory (BLSTM) recurrent neural networks to the Interspeech 2013 Computational Paralinguistics Social Signals Sub-Challenge dataset requiring frame-wise, speaker-independent...
This paper proposes a novel machine learning approach for the task of on-line continuous-time music mood regression, i.e., low-latency prediction of the time-varying arousal and valence in musical pieces. On the front-end, a large set of segmental acoustic features is extracted to model short-term variations. Then, multi-variate regression is performed by deep recurrent neural networks to model longer-range...
This paper seeks to exploit high-level temporal information during feature extraction from audio signals via non-negative matrix factorization. Contrary to existing approaches that impose local temporal constraints, we train powerful recurrent neural network models to capture long-term temporal dependencies and event co-occurrence in the data. This gives our method the ability to “fill in the blanks”...
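The abstract builds on non-negative matrix factorization. As a baseline reference for that component (without the paper's RNN temporal constraints), a standard multiplicative-update NMF can be sketched as follows; the rank, iteration count, and initialization are illustrative choices.

```python
import numpy as np

def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (m x n) into W (m x rank) @ H (rank x n)
    using Lee & Seung multiplicative updates for the Frobenius objective."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + 0.1
    H = rng.random((rank, n)) + 0.1
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update basis vectors
    return W, H

# Example: recover a low-rank non-negative matrix.
rng = np.random.default_rng(1)
V = rng.random((8, 2)) @ rng.random((2, 6))  # exact rank-2, non-negative
W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In audio applications, V is typically a magnitude spectrogram, W holds spectral templates, and H their time-varying activations; the paper replaces local temporal constraints on H with an RNN model of long-term dependencies.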