Speaker identification using multi-step clustering algorithm with transformation-based GMM

Limin Xu; Zhenmin Tang

doi:10.3103/S0146411607040062

Source

Automatic Control and Computer Sciences, 2007, vol. 41, no. 4, pp. 224-231

Abstract

To improve the performance of speaker recognition, an embedded linear transformation is used to integrate the transformation and the diagonal-covariance Gaussian mixture model (GMM) into a unified framework. In this case, the mixture number of the GMM must be fixed during model training. The cluster expectation-maximization (EM) algorithm is a well-known technique in which the mixture number is instead treated as a parameter to be estimated. This paper presents a new model structure that integrates a multi-step cluster algorithm into the estimation process of the GMM with the embedded transformation. In this approach, the transformation matrix, the mixture number, and the model parameters are estimated simultaneously according to a maximum likelihood criterion. The proposed method is demonstrated on a database of three data sessions for text-independent speaker identification. The experiments show that this method outperforms the traditional GMM trained with the cluster EM algorithm.
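
The scoring side of such a system can be illustrated with a minimal sketch. It is not the authors' implementation and does not cover the multi-step clustering or the training procedure. It assumes one common form of a diagonal-covariance GMM with an embedded linear transformation, p(x) = |det A| Σ_k w_k N(Ax; μ_k, diag(σ_k²)), and picks the speaker whose model gives the highest average log-likelihood. The function names (`log_abs_det`, `frame_loglik`, `identify`), the toy models, and the toy frames are all hypothetical.

```python
import math

def log_abs_det(A):
    """log|det A| by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in A]
    n = len(M)
    total = 0.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[p][c] == 0.0:
            raise ValueError("transformation matrix is singular")
        M[c], M[p] = M[p], M[c]
        total += math.log(abs(M[c][c]))
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n):
                M[r][j] -= f * M[c][j]
    return total

def frame_loglik(x, A, weights, means, variances):
    """log p(x) for one frame: diagonal GMM evaluated on y = A x (assumed form)."""
    y = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    comps = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for yd, md, vd in zip(y, mu, var):
            ll -= 0.5 * (math.log(2.0 * math.pi * vd) + (yd - md) ** 2 / vd)
        comps.append(ll)
    m = max(comps)  # log-sum-exp over components, computed stably
    # The Jacobian term log|det A| keeps scores comparable across transforms.
    return m + math.log(sum(math.exp(c - m) for c in comps)) + log_abs_det(A)

def identify(frames, models):
    """Return the speaker whose model gives the highest average log-likelihood."""
    def score(params):
        return sum(frame_loglik(x, *params) for x in frames) / len(frames)
    return max(models, key=lambda name: score(models[name]))

I2 = [[1.0, 0.0], [0.0, 1.0]]
models = {
    # (A, weights, means, variances); the mixture number may differ per speaker
    "spk_a": (I2, [0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "spk_b": (I2, [1.0], [[5.0, 5.0]], [[1.0, 1.0]]),
}
frames = [[4.8, 5.1], [5.3, 4.9], [5.0, 5.2]]
print(identify(frames, models))  # prints "spk_b"
```

In this toy setup each speaker model carries its own transformation matrix and its own number of mixture components, which mirrors the abstract's point that both are estimated rather than fixed in advance. How those quantities are estimated is exactly what the paper addresses and is not shown here.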