Non-negative matrix deconvolution in noise robust speech recognition

Antti Hurmalainen; Jort Gemmeke; Tuomas Virtanen

doi:10.1109/ICASSP.2011.5947376

Source

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4588-4591

Abstract

High noise robustness has been achieved in speech recognition by using sparse exemplar-based methods with spectrogram windows spanning up to 300 ms. A downside is that a large exemplar dictionary is required to cover sufficiently many spectral patterns and their temporal alignments within windows. We propose a recognition system based on a shift-invariant convolutive model, where exemplar activations at all possible temporal positions jointly reconstruct an utterance. Recognition rates are evaluated using the AURORA-2 database, containing spoken digits with noise conditions ranging from clean speech to −5 dB SNR. We obtain results superior to those obtained when the activations are found independently for each overlapping window.
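
The shift-invariant convolutive model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the cost function or the optimisation method, so the generalized Kullback-Leibler divergence and the multiplicative update used here are assumptions, as are the array layout and the hypothetical names `reconstruct` and `nmd_activations`. The dictionary `A` holds `K` exemplars of `T` frames and `B` frequency bands each, and `X[k, n]` is the activation of exemplar `k` starting at frame `n`.

```python
import numpy as np


def reconstruct(A, X):
    """Sum every exemplar, shifted to every start frame, weighted by its activation.

    A: dictionary, shape (T, B, K). X: activations, shape (K, N).
    Returns a (B, N + T - 1) non-negative spectrogram estimate.
    """
    T, B, K = A.shape
    N = X.shape[1]
    Y_hat = np.zeros((B, N + T - 1))
    for t in range(T):
        # Frame t of each exemplar lands t frames after its activation position.
        Y_hat[:, t:t + N] += A[t] @ X
    return Y_hat


def nmd_activations(Y, A, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations for a fixed dictionary A.

    Assumption: generalized KL divergence with a multiplicative update.
    """
    T, B, K = A.shape
    N = Y.shape[1] - T + 1
    rng = np.random.default_rng(seed)
    X = rng.random((K, N)) + 0.1  # strictly positive starting point
    for _ in range(n_iter):
        R = Y / (reconstruct(A, X) + eps)  # observation-to-model ratio
        num = np.zeros_like(X)
        den = np.zeros_like(X)
        for t in range(T):
            # Correlate each exemplar frame with the ratio at the matching shift.
            num += A[t].T @ R[:, t:t + N]
            den += A[t].sum(axis=0)[:, None]
        X *= num / (den + eps)
    return X
```

Because a single activation matrix explains the whole utterance, one exemplar can cover a pattern at any temporal position, rather than needing a separate dictionary entry for each alignment within a window.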

Identifiers

book ISSN :	1520-6149
book e-ISSN :	1520-6149
book ISBN :	978-1-4577-0538-0
book e-ISBN :	978-1-4577-0539-7, 978-1-4577-0537-3
DOI :	10.1109/ICASSP.2011.5947376