Comparison of Large Margin Training to Other Discriminative Methods for Phonetic Recognition by Hidden Markov Models

Fei Sha; Lawrence K. Saul

doi:10.1109/ICASSP.2007.366912

Source

2007 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '07), vol. 4, pp. IV-313 to IV-316

Abstract

In this paper we compare three frameworks for discriminative training of continuous-density hidden Markov models (CD-HMMs). Specifically, we compare two popular frameworks, based on conditional maximum likelihood (CML) and minimum classification error (MCE), to a new framework based on margin maximization. Unlike CML and MCE, our formulation of large margin training explicitly penalizes incorrect decodings by an amount proportional to the number of mislabeled hidden states. It also leads to a convex optimization over the parameter space of CD-HMMs, thus avoiding the problem of spurious local minima. We used discriminatively trained CD-HMMs from all three frameworks to build phonetic recognizers on the TIMIT speech corpus. The different recognizers employed exactly the same acoustic front end and hidden state space, thus enabling us to isolate the effect of different cost functions, parameterizations, and numerical optimizations. Experimentally, we find that our framework for large margin training yields significantly lower error rates than both CML and MCE training.
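
The abstract's central idea, that an incorrect decoding is penalized in proportion to its number of mislabeled hidden states, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `hamming_distance` and `margin_penalty`, the scalar sequence scores, and the toy phone labels are all assumptions made for illustration. The sketch assumes that each state sequence receives a single discriminant score and that the correct sequence should outscore a competing one by at least the number of positions where the two differ.

```python
from typing import Sequence


def hamming_distance(target: Sequence[str], decoded: Sequence[str]) -> int:
    """Count the positions where two equal-length state sequences differ."""
    if len(target) != len(decoded):
        raise ValueError("state sequences must have the same length")
    return sum(t != d for t, d in zip(target, decoded))


def margin_penalty(score_target: float, score_decoded: float,
                   target: Sequence[str], decoded: Sequence[str]) -> float:
    """Hinge-style penalty whose required margin grows with the number of
    mislabeled states (an illustrative assumption, not the paper's exact loss)."""
    required_margin = hamming_distance(target, decoded)
    # No penalty once the correct sequence wins by at least the required margin.
    return max(0.0, required_margin - (score_target - score_decoded))


# Hypothetical frame-level labels and scores, chosen only for the example.
target = ["sil", "ae", "ae", "t"]
near_miss = ["sil", "ae", "eh", "t"]   # 1 mislabeled state
far_miss = ["sil", "eh", "eh", "d"]    # 3 mislabeled states

print(margin_penalty(5.0, 3.5, target, near_miss))  # score gap 1.5 covers margin 1
print(margin_penalty(5.0, 3.5, target, far_miss))   # score gap 1.5 falls short of margin 3
```

With the same score gap, the decoding that mislabels three states incurs a penalty while the one that mislabels a single state does not, which is the behavior the abstract contrasts with CML and MCE.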