Sports audio segmentation and classification

Jun Huang; Yuan Dong; Jiqing Liu; Chengyu Dong; Haila Wang

doi:10.1109/ICNIDC.2009.5360872

Source

2009 IEEE International Conference on Network Infrastructure and Digital Content, pp. 379-383

Abstract

The audio stream is an important component of a sports video. In this paper, we present a system for audio segmentation and classification that segments the sports audio stream and classifies each segment as speech or non-speech. The novelty of our approach is that we apply the segmentation and clustering methods commonly used in speaker diarization systems for broadcast news to the analysis of sports videos. After segmentation and Bayesian Information Criterion (BIC) clustering, a Gaussian Mixture Model (GMM) classifier identifies the type of sound in each segment. Experiments on a database of 6 hours of audio from Eurosport TV programs show that the average segmentation and classification accuracy reaches 87.3%. This approach is useful for detailed content analysis of sports videos.
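To illustrate the BIC step mentioned in the abstract, the following is a minimal sketch of the standard delta-BIC test used in speaker diarization to decide whether two adjacent segments should be merged. It is not the authors' implementation. The diagonal-covariance Gaussian model, the `penalty_weight` parameter, the function names, and the synthetic "speech" and "crowd" features are all assumptions made for illustration; the abstract does not specify the features or the model settings.

```python
import math
import random
from statistics import pvariance


def log_det_diag(frames):
    """Log-determinant of a diagonal covariance fitted to the frames.

    frames: list of feature vectors (lists of floats), one per audio frame.
    Assumption: dimensions are modeled independently (diagonal covariance).
    """
    # Transpose so each column holds one feature dimension across all frames.
    columns = zip(*frames)
    # Floor the variance to avoid log(0) on constant features.
    return sum(math.log(max(pvariance(col), 1e-10)) for col in columns)


def delta_bic(seg_a, seg_b, penalty_weight=1.0):
    """Delta-BIC between modeling two segments separately and jointly.

    A positive value favors two separate Gaussians (different sources);
    a negative value favors one shared Gaussian (merge the segments).
    """
    n_a, n_b = len(seg_a), len(seg_b)
    n = n_a + n_b
    dim = len(seg_a[0])
    # Likelihood gain from using two models instead of one.
    gain = 0.5 * (n * log_det_diag(seg_a + seg_b)
                  - n_a * log_det_diag(seg_a)
                  - n_b * log_det_diag(seg_b))
    # Extra free parameters of the second model: mean and variance per dimension.
    n_params = 2 * dim
    penalty = 0.5 * penalty_weight * n_params * math.log(n)
    return gain - penalty


def synthetic_segment(rng, mean, n_frames=300, dim=4):
    """Hypothetical stand-in for real acoustic features (e.g. cepstral frames)."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n_frames)]


rng = random.Random(0)
speech_a = synthetic_segment(rng, 0.0)
speech_b = synthetic_segment(rng, 0.0)   # same source as speech_a
crowd = synthetic_segment(rng, 3.0)      # different source

print("merge speech_a + speech_b:", delta_bic(speech_a, speech_b) < 0)
print("merge speech_a + crowd:   ", delta_bic(speech_a, crowd) < 0)
```

In a pipeline like the one described, segments that the test merges into the same cluster would then be scored against per-class GMMs (speech, non-speech) and labeled with the class of highest likelihood.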