Language models are known to be effective for increasing the accuracy of speech and handwriting recognizers, but large models are often required to achieve low perplexity (or, equivalently, low entropy) while still providing adequate language coverage. We study three efficient methods for variable-order stochastic language modeling in the context of the stochastic pattern recognition problem. Two of the methods are existing techniques from the recent literature; the third is a new method based on a successful text compression technique. We present a comparative analysis and demonstrate that the best performance is achieved by extending one of the existing techniques with elements of the newly developed method.
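
For reference, the parenthetical equivalence of perplexity and entropy is the standard identity (a well-known relation, not specific to this paper); here w_1, ..., w_N denote the words of a test sequence and N its length, in our notation rather than the paper's:

\[
  H \;=\; -\frac{1}{N} \sum_{i=1}^{N} \log_2 P(w_i \mid w_1, \ldots, w_{i-1}),
  \qquad
  \mathrm{PP} \;=\; 2^{H},
\]

so minimizing perplexity and minimizing per-word entropy are the same objective.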
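
As a minimal sketch of what variable-order stochastic language modeling with compression-style backoff can look like (this is an illustration in the spirit of PPM-C escape estimation, not the authors' method; all names are hypothetical, and exclusions are omitted for brevity):

from collections import defaultdict

class VariableOrderModel:
    """Variable-order model with PPM-C style escape/backoff.

    Illustrative simplification: symbol exclusions are omitted,
    so probabilities differ slightly from full PPM.
    """

    def __init__(self, max_order=3, alphabet_size=256):
        self.max_order = max_order
        self.alphabet_size = alphabet_size
        # counts[context][symbol] -> frequency of symbol after context
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, seq):
        # Record each symbol under all contexts of order 0..max_order.
        for i, sym in enumerate(seq):
            for k in range(self.max_order + 1):
                if i - k < 0:
                    break
                self.counts[tuple(seq[i - k:i])][sym] += 1

    def prob(self, history, sym):
        """P(sym | history), backing off from long to short contexts."""
        p_escape = 1.0
        for k in range(min(self.max_order, len(history)), -1, -1):
            ctx = tuple(history[len(history) - k:])
            table = self.counts.get(ctx)
            if not table:
                continue  # unseen context: fall through to shorter order
            n = sum(table.values())  # total observations in this context
            d = len(table)           # distinct symbols = escape mass (PPM-C)
            if sym in table:
                return p_escape * table[sym] / (n + d)
            p_escape *= d / (n + d)  # escape to the next shorter context
        # Order -1 fallback: uniform over the alphabet
        return p_escape / self.alphabet_size

A short usage example under the same assumptions: after m = VariableOrderModel(max_order=2, alphabet_size=27) and m.train("abracadabra"), the call m.prob("abr", "a") returns a high probability, since "a" always follows "br" in the training text, while unseen symbols receive the escaped, backed-off mass.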