A stacked sparse autoencoder based architecture for Punjabi and English spoken language classification using MFCC features

Vaibhav Arora; Pulkit Sood; Kumar Utkarsh Keshari

Source

2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), pp. 269-272

Abstract

Spoken language classification is an important task in speech processing. It can serve as a preprocessing step in pipelines for automated understanding of the semantics of human speech. This paper proposes a sparse autoencoder based architecture for classifying spoken Punjabi and English. Several shallow architectures (a softmax classifier and an SVM) and deep architectures (artificial neural networks, an SVM with a sparse autoencoder, and softmax with a sparse autoencoder, with and without fine-tuning) are compared on the same task. The experiments use noisy speech samples of 1 second each, represented by frame-based MFCC coefficients. Principal Component Analysis and sparse autoencoder based features are also compared on this dataset.
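
As an illustration of the kind of objective a sparse autoencoder typically minimizes, the following is a minimal sketch in NumPy. It is not the authors' implementation: the abstract does not give the layer sizes, sparsity target, or penalty weights, so the 13-dimensional input (standing in for per-frame MFCC coefficients), the 8 hidden units, and the values of `rho`, `beta`, and `lam` are all assumptions chosen for illustration, and the data is random rather than real speech. The loss combines reconstruction error, weight decay, and a KL-divergence penalty that pushes the mean activation of each hidden unit towards `rho`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparse_autoencoder_loss(X, W1, b1, W2, b2, rho=0.05, beta=3.0, lam=1e-4):
    """Return (loss, hidden activations) for a one-hidden-layer sparse autoencoder.

    rho, beta and lam are illustrative values, not taken from the paper.
    """
    H = sigmoid(X @ W1 + b1)        # hidden code, shape (n_samples, n_hidden)
    X_hat = sigmoid(H @ W2 + b2)    # reconstruction, shape (n_samples, n_inputs)
    rho_hat = H.mean(axis=0)        # mean activation of each hidden unit

    # Average squared reconstruction error per sample.
    recon = 0.5 * np.mean(np.sum((X_hat - X) ** 2, axis=1))
    # L2 weight decay on both weight matrices.
    decay = 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    # KL divergence between the target sparsity rho and the observed rho_hat.
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return recon + decay + beta * kl, H

# Synthetic stand-in for MFCC frames: 100 frames, 13 coefficients scaled to [0, 1).
rng = np.random.default_rng(0)
n_inputs, n_hidden = 13, 8          # assumed sizes, for illustration only
X = rng.random((100, n_inputs))
W1 = rng.normal(scale=0.1, size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_inputs))
b2 = np.zeros(n_inputs)

loss, H = sparse_autoencoder_loss(X, W1, b1, W2, b2)
```

In a pipeline of the kind the abstract describes, the hidden activations `H` learned this way would be passed to a classifier such as softmax or an SVM, optionally followed by fine-tuning of the whole stack.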