Statistical and computational guarantees for the Baum-Welch algorithm

Fanny Yang; Sivaraman Balakrishnan; Martin J. Wainwright

doi:10.1109/ALLERTON.2015.7447067

Source

2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 658-665

Abstract

The Hidden Markov Model (HMM) is one of the mainstays of statistical modeling of discrete time series and is widely used in many applications. Estimating an HMM from its observation process is often addressed via the Baum-Welch algorithm, which performs well empirically when initialized reasonably close to the truth. Existing theory, which predicts susceptibility to bad local optima, could not explain this behavior. In this paper, we aim to close this gap by providing a framework that characterizes a sufficient basin of attraction for any global optimum, within which Baum-Welch is guaranteed to converge linearly to an “optimally” small ball around the global optimum. The framework is then used to determine the linear rate of convergence and a sufficient initialization region for Baum-Welch applied to a two-component isotropic hidden Markov mixture of Gaussians.
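To make the setting in the abstract concrete, the following is a minimal sketch of Baum-Welch on a two-state HMM with Gaussian emissions. It is an illustration under stated assumptions, not the authors' implementation: the observations are one-dimensional, the hidden state takes values in {-1, +1} with emission means -mu and +mu, and the noise level `sigma` and the stay probability `p_stay` are treated as known, so only the mean is updated. All function and parameter names (`simulate`, `posterior_plus`, `baum_welch_mean`, `p_stay`) are hypothetical.

```python
import numpy as np

def simulate(n, mu, p_stay, sigma, rng):
    """Draw a hidden chain z_t in {-1, +1} and observations x_t = z_t * mu + noise."""
    z = np.empty(n, dtype=int)
    z[0] = rng.choice([-1, 1])
    for t in range(1, n):
        z[t] = z[t - 1] if rng.random() < p_stay else -z[t - 1]
    x = z * mu + sigma * rng.standard_normal(n)
    return x, z

def posterior_plus(x, mu, p_stay, sigma):
    """E-step: scaled forward-backward pass, returns P(z_t = +1 | x_1, ..., x_n)."""
    n = len(x)
    A = np.array([[p_stay, 1 - p_stay], [1 - p_stay, p_stay]])  # symmetric transitions
    means = np.array([-mu, mu])  # column 0 is state -1, column 1 is state +1
    logb = -(x[:, None] - means[None, :]) ** 2 / (2 * sigma ** 2)
    b = np.exp(logb - logb.max(axis=1, keepdims=True))  # emissions up to a per-step constant
    alpha = np.empty((n, 2))
    beta = np.empty((n, 2))
    alpha[0] = 0.5 * b[0]  # uniform initial distribution (an assumption)
    alpha[0] /= alpha[0].sum()
    for t in range(1, n):
        alpha[t] = (alpha[t - 1] @ A) * b[t]
        alpha[t] /= alpha[t].sum()  # rescale to avoid underflow
    beta[-1] = 1.0
    for t in range(n - 2, -1, -1):
        beta[t] = A @ (b[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)  # per-step normalization removes the scalings
    return gamma[:, 1]

def baum_welch_mean(x, mu0, p_stay, sigma, iters=50):
    """Alternate the E-step above with the closed-form M-step for mu."""
    mu = mu0
    for _ in range(iters):
        g = posterior_plus(x, mu, p_stay, sigma)
        mu = np.mean((2 * g - 1) * x)  # maximizer of the expected complete log-likelihood in mu
    return mu
```

A typical call would simulate data with `simulate(2000, 2.0, 0.8, 1.0, np.random.default_rng(0))` and then run `baum_welch_mean(x, mu0=0.5, p_stay=0.8, sigma=1.0)`. Starting from different values of `mu0` gives a rough empirical feel for the initialization region the abstract refers to, though this sketch makes no claim about matching the paper's guarantees.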