Mirco Ravanelli

article

Automatic context window composition for distant speech recognition

Mirco Ravanelli, Maurizio Omologo

Speech Communication > 2018 > 101 > C > 34-44

Distant speech recognition is being revolutionized by deep learning, which has contributed to significantly outperforming previous HMM-GMM systems. A key aspect behind the rapid rise and success of DNNs is their ability to better manage large time contexts. In this regard, asymmetric context windows that embed more past than future frames have recently been used with feed-forward neural networks. This...

chapter

A network of deep neural networks for Distant Speech Recognition

Mirco Ravanelli, Philemon Brakel, Maurizio Omologo, Yoshua Bengio

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4880 - 4884

Despite the remarkable progress recently made in distant speech recognition, state-of-the-art technology still suffers from a lack of robustness, especially when adverse acoustic conditions characterized by non-stationary noises and reverberation are met.

chapter

Batch-normalized joint training for DNN-based distant speech recognition

Mirco Ravanelli, Philemon Brakel, Maurizio Omologo, Yoshua Bengio

2016 IEEE Spoken Language Technology Workshop (SLT) > 28 - 34

Improving distant speech recognition is a crucial step towards flexible human-machine interfaces. Current technology, however, still exhibits a lack of robustness, especially when adverse acoustic conditions are met. Despite the significant progress made in recent years on both speech enhancement and speech recognition, one potential limitation of state-of-the-art technology lies in composing modules...

chapter

The DIRHA-ENGLISH corpus and related tasks for distant-speech recognition in domestic environments

Mirco Ravanelli, Luca Cristoforetti, Roberto Gretter, Marco Pellin, and others

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) > 275 - 282

This paper introduces the contents and the possible usage of the DIRHA-ENGLISH multi-microphone corpus, recently realized under the EC DIRHA project. The reference scenario is a domestic environment equipped with a large number of microphones and microphone arrays distributed in space. The corpus is composed of both real and simulated material, and it includes 12 US and 12 UK English native speakers...

chapter

A multi-channel corpus for distant-speech interaction in presence of known interferences

Erich Zwyssig, Mirco Ravanelli, Piergiorgio Svaizer, Maurizio Omologo

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 4480 - 4484

This paper describes a new corpus of multi-channel audio data designed to study and develop distant-speech recognition systems able to cope with known interfering sounds propagating in an environment. The corpus consists of both real and simulated signals and of a corresponding detailed annotation. An extensive set of speech recognition experiments was conducted using three different Acoustic Echo...

chapter

TANDEM-bottleneck feature combination using hierarchical Deep Neural Networks

Mirco Ravanelli, Van Hai Do, Adam Janin

The 9th International Symposium on Chinese Spoken Language Processing > 113 - 117

2014 9th International Symposium on Chinese Spoken Language Processing (ISCSLP)

To improve speech recognition performance, a combination of TANDEM and bottleneck Deep Neural Networks (DNN) is investigated. In particular, exploiting a feature combination performed by means of multi-stream hierarchical processing, we show a performance improvement by combining the same input features processed by different neural networks. The experiments are based on the spontaneous telephone...

chapter

A speech event detection and localization task for multiroom environments

Alessio Brutti, Mirco Ravanelli, Piergiorgio Svaizer, Maurizio Omologo

2014 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA) > 157 - 161

Domestic environments are particularly challenging for distant speech recognition and audio processing in general. Reverberation, background noise and interfering sources, as well as the propagation of acoustic events across adjacent rooms, critically degrade the performance of standard speech processing algorithms. The DIRHA EU project addresses the development of distant-speech interaction with...

chapter

Audio concept classification with Hierarchical Deep Neural Networks

Mirco Ravanelli, Benjamin Elizalde, Karl Ni, Gerald Friedland

2014 22nd European Signal Processing Conference (EUSIPCO) > 606 - 610

Audio-based multimedia retrieval tasks may identify semantic information in audio streams, i.e., audio concepts (such as music, laughter, or a revving engine). Conventional Gaussian-Mixture-Models have had some success in classifying a reduced set of audio concepts. However, multi-class classification can benefit from context window analysis and the discriminating power of deeper architectures. Although...

chapter

Impulse response estimation for robust speech recognition in a reverberant environment

Mirco Ravanelli, Alessandro Sosi, Piergiorgio Svaizer, Maurizio Omologo

2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO) > 1668 - 1672

This paper refers to a voice-enabled smart-home scenario, for which contaminated speech is produced to train a distant-speech recognition system. The impulse response measurement process is investigated, with a specific focus on its impact on speech recognition performance. Experimental results, related to a phone-loop and to a word-loop task, show that a significant change of performance is obtained...

INFONA - science communication portal

Search results for: Mirco Ravanelli
