Limited resource term detection for effective topic identification of speech

Jonathan Wintrode; Sanjeev Khudanpur

doi:10.1109/ICASSP.2014.6854981

Source

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 7118 - 7122

Abstract

We consider the task of identifying topics in recorded speech across many languages. We identify a statistically discriminative set of topic keywords and examine the relationship between overall word error rate (WER), keyword-specific detection performance, and topic identification (Topic ID) performance on the Fisher Spanish corpus. Building increasingly constrained systems (from LVCSR with copious training data, to LVCSR with limited training data, to limited-vocabulary keyword spotting), we show that neither high WER (>60%) nor low-precision term detection (<40%) is necessarily an impediment to Topic ID. By using deep neural net acoustic models for keyword spotting, we can double recall and ranked retrieval performance over comparable PLP-based models and achieve Topic ID performance on par with well-trained LVCSR or human transcripts.
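
The abstract contrasts overall WER with keyword-specific precision and recall. As a rough illustration of how these quantities can diverge, the following minimal Python sketch computes WER by word-level edit distance and a simplified, bag-of-words precision and recall restricted to a keyword set. The function names, the toy sentences, and the keyword set are all hypothetical, and the scoring ignores time alignment and detection confidence, so it is not the evaluation procedure used in the paper.

```python
from collections import Counter
from typing import List, Set, Tuple


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)


def keyword_precision_recall(reference: List[str],
                             hypothesis: List[str],
                             keywords: Set[str]) -> Tuple[float, float]:
    """Simplified keyword scoring: count-based, ignoring timing (assumption)."""
    ref_counts = Counter(w for w in reference if w in keywords)
    hyp_counts = Counter(w for w in hypothesis if w in keywords)
    # A hypothesized keyword counts as a hit only up to its reference count.
    hits = sum((ref_counts & hyp_counts).values())
    n_hyp = sum(hyp_counts.values())
    n_ref = sum(ref_counts.values())
    precision = hits / n_hyp if n_hyp else 0.0
    recall = hits / n_ref if n_ref else 0.0
    return precision, recall


# Toy example (invented, not from the Fisher Spanish corpus).
reference = "el partido de futbol fue ayer".split()
hypothesis = "el partido futbol futbol fue hoy".split()
topic_keywords = {"partido", "futbol"}

wer = word_error_rate(reference, hypothesis)
precision, recall = keyword_precision_recall(reference, hypothesis, topic_keywords)
```

In this toy case the hypothesis contains errors on non-keyword words and one spurious keyword, yet every reference keyword is still found, which is the kind of situation in which a topic classifier relying on keywords could remain usable despite a poor transcript.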