Block sparse excitation based all-pole modeling of speech

Ritwik Giri; Bhaskar D. Rao

doi:10.1109/ICASSP.2014.6854303

Source

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) > 3754 - 3758

Abstract

In this paper, it is shown that an appropriate model for voiced speech is an all-pole filter excited by a block sparse excitation sequence. The modeling approach is generalized in a novel manner to deal with a wide spectrum of speech signals: voiced speech, unvoiced speech, and mixed-excitation speech. In this context, the input sequence to the all-pole model is modeled as a suitably weighted linear combination of a block sparse signal and white noise. We develop the corresponding estimation procedure to reconstruct the generalized input sequence and the model parameters via sparse Bayesian learning methods employing an Expectation-Maximization based procedure. Rigorous experiments have been performed to show the efficacy of our proposed model for the speech modeling task. By imposing a block sparse structure on the input sequence, the problems associated with the commonly used Linear Prediction approach are alleviated, leading to a more robust modeling scheme.
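The signal model described in the abstract can be illustrated with a minimal synthesis sketch. It covers only the forward (generative) side: an all-pole filter driven by a block sparse sequence plus weighted white noise. It does not implement the paper's sparse Bayesian learning / EM estimation. The function names, the periodic placement of blocks, the Gaussian block amplitudes, the single scalar `noise_weight`, and the example filter coefficients are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def block_sparse_excitation(n, block_len, period, rng):
    """Sequence that is zero except for short blocks of nonzero samples.

    Assumption: blocks start every `period` samples and hold Gaussian values.
    """
    e = np.zeros(n)
    for start in range(0, n, period):
        stop = min(start + block_len, n)
        e[start:stop] = rng.standard_normal(stop - start)
    return e

def all_pole_filter(excitation, a):
    """Apply x[n] = sum_k a[k] * x[n-k-1] + excitation[n] (zero initial state)."""
    p = len(a)
    x = np.zeros(len(excitation))
    for n in range(len(excitation)):
        past = sum(a[k] * x[n - k - 1] for k in range(p) if n - k - 1 >= 0)
        x[n] = past + excitation[n]
    return x

def synthesize(n, a, block_len, period, noise_weight, seed=0):
    """Drive the all-pole filter with block sparse signal + weighted white noise.

    noise_weight = 0 gives a purely block sparse (voiced-like) input;
    larger values move toward a noise-like (unvoiced-like) input.
    Returns (output signal, total input sequence, block sparse component).
    """
    rng = np.random.default_rng(seed)
    sparse = block_sparse_excitation(n, block_len, period, rng)
    noise = rng.standard_normal(n)
    u = sparse + noise_weight * noise  # hypothetical weighting scheme
    return all_pole_filter(u, a), u, sparse

if __name__ == "__main__":
    # Hypothetical stable second-order filter (pole radius about 0.89).
    a = [1.3, -0.8]
    x, u, sparse = synthesize(n=200, a=a, block_len=5, period=50, noise_weight=0.0)
    print("nonzero input samples:", np.count_nonzero(u), "of", len(u))
```

Varying `noise_weight` from zero upward in this sketch moves the input from a purely block sparse sequence toward a mixed and then noise-dominated one, mirroring the voiced, mixed, and unvoiced cases the abstract mentions.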