Linear source-filter models have been widely used by researchers as a front-end for speaker identification systems. These models rely on cepstral features derived from the power spectrum of the speech signal. However, it is well known that a significant part of the acoustic information cannot be captured by the linear source-filter model, so the need for nonlinear features becomes apparent. In this paper,...
This paper proposes a robust feature extraction technique for Isolated Word Recognition (IWR) based on the Teager Energy Operator (TEO). The feature extraction algorithm is motivated by the enhanced discrimination capability of the TEO, which estimates the true energy of the source of a resonance. Robustness is further improved by applying Cepstral Mean Normalization (CMN) to the estimated features. The robust...
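As a rough illustration of the two building blocks named in this abstract, the sketch below implements the standard discrete TEO, psi[n] = x[n]^2 - x[n-1]*x[n+1], and a plain per-coefficient CMN. The function names are hypothetical, and since the abstract is truncated, this is not the paper's actual pipeline, only the textbook form of each operation.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator for the interior samples of x.

    psi[n] = x[n]^2 - x[n-1] * x[n+1], so the output has len(x) - 2 values.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def cepstral_mean_normalize(frames):
    """Subtract the mean of each coefficient, taken across all frames.

    `frames` is a non-empty list of equal-length feature vectors.
    """
    n_frames = len(frames)
    dim = len(frames[0])
    means = [sum(f[k] for f in frames) / n_frames for k in range(dim)]
    return [[f[k] - means[k] for k in range(dim)] for f in frames]


if __name__ == "__main__":
    # For a pure tone A*cos(w*n) the TEO is constant: A^2 * sin(w)^2.
    tone = [2.0 * math.cos(0.3 * n) for n in range(50)]
    print(teager_energy(tone)[:3])
    print(cepstral_mean_normalize([[1.0, 2.0], [3.0, 6.0]]))
```

For a pure tone the operator output depends on both amplitude and frequency, which is the sense in which it tracks the energy of the source rather than of the signal alone.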
This paper introduces a robust feature extraction algorithm for speech recognition. The algorithm is motivated by the enhanced discrimination capability of the Teager Energy Operator (TEO), which estimates the true energy of the source of a resonance. The robust features are computed from the speech signal of a given frame through the following steps. First, the short time spectrum of each...
This paper describes a polynomial kernel subspace approach to speaker recognition systems. An auditory-motivated wavelet packet transform is used to derive the desirable speaker features. The nonlinear mapping between the input space and the feature space is implicitly performed using the kernel trick. This nonlinear mapping increases the discrimination capability of a pattern classifier. The use of Mel-scale...
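The kernel trick mentioned here can be made concrete with a small sketch. It assumes the common inhomogeneous polynomial kernel k(x, y) = (x·y + c)^d with d = 2 and c = 1; the abstract does not state the degree or offset used in the paper, so these values and the function names are illustrative only. The explicit degree-2 feature map for 2-D inputs is included to show that the kernel value equals an inner product in the mapped space.

```python
import math


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Inhomogeneous polynomial kernel (x . y + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def explicit_map_degree2(x, coef0=1.0):
    """Explicit feature map for the degree-2 kernel on 2-D inputs only.

    Inner products of these 6-D vectors reproduce polynomial_kernel
    with degree=2 and the same coef0.
    """
    x1, x2 = x
    s = math.sqrt(2.0 * coef0)
    return [x1 * x1, x2 * x2, math.sqrt(2.0) * x1 * x2, s * x1, s * x2, coef0]


if __name__ == "__main__":
    x, y = (1.0, 2.0), (3.0, -1.0)
    implicit = polynomial_kernel(x, y)
    explicit = sum(a * b for a, b in zip(explicit_map_degree2(x),
                                         explicit_map_degree2(y)))
    print(implicit, explicit)
```

The implicit form never builds the mapped vectors, which is what keeps kernel subspace methods tractable as the input dimension and the polynomial degree grow.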
This paper describes a polynomial kernel subspace approach to Isolated Word Recognition (IWR) systems. Linear Predictive Coding (LPC) coefficients derived from wavelet sub-bands of each speech frame are used as features. This approach represents a mapping of speech features (input space) into a feature space via a non-linear mapping onto the principal components called Kernel Linear Discriminant Analysis...
In this paper, a nonlinear AM-FM speech model is used to extract robust features for speaker identification. The proposed features measure the amount of amplitude and frequency modulation that the commonly used linear source-filter model and the Mel frequency cepstral coefficient (MFCC) features fail to capture. From the short time estimates of the frequency and bandwidth, a novel set of features...
This paper presents an isolated word recognition system using a polynomial classifier. Along with high accuracy, speech recognition applications also require low complexity and little storage space, both of which are achieved using the polynomial classifier. The speech features used are the well-known mel-frequency cepstral coefficients (MFCC). The performance of the classifier is tested for MFCC of size 12...
Feature extraction in noisy conditions is one of the most important issues in speech recognition systems. There are two dominant approaches to acoustic measurement. The first is the parametric approach in the temporal domain, such as linear prediction (LP); the second is the nonparametric approach in the frequency domain, such as Mel frequency cepstral coefficients (MFCC), based on the human auditory perception system...
This paper presents the use of an auditory-perception-based admissible wavelet packet tree (WPT) for partitioning speech frequencies into different bands based on the Mel scale or the Bark scale. The proposed WPTs, selected using a root mean square error (RMSE) criterion, mimic the Mel scale or the Bark scale, and hence the human auditory system, more accurately. Performance of the features obtained from...
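To make the RMSE selection idea more tangible, the sketch below computes Mel-spaced band edges and an RMSE between two sets of band edges. It assumes the widely used formula mel = 2595 * log10(1 + f / 700); several Mel and Bark variants exist and the abstract does not say which one the paper uses. The function names, and the idea of comparing band edges specifically, are assumptions made for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (2595 * log10(1 + f / 700) variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_max_hz, n_bands):
    """Edges (in Hz) of n_bands bands equally spaced on the Mel scale."""
    mel_max = hz_to_mel(f_max_hz)
    return [mel_to_hz(mel_max * i / n_bands) for i in range(n_bands + 1)]


def rmse(a, b):
    """Root mean square error between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))


if __name__ == "__main__":
    target = mel_band_edges(4000.0, 8)
    # Hypothetical candidate: 8 uniform bands of 500 Hz each, as produced
    # by a full three-level wavelet packet tree at 8 kHz sampling.
    uniform = [500.0 * i for i in range(9)]
    print(rmse(uniform, target))
```

A candidate tree with a lower RMSE against the target edges follows the perceptual scale more closely; an admissible tree can only split bands dyadically, so the match is approximate by construction.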
In this paper, new feature extraction methods that utilize reduced-order Linear Predictive Coding (LPC) coefficients for speech recognition are proposed. The coefficients are derived from speech frames decomposed using the Discrete Wavelet Transform (DWT). In the literature it is assumed that a speech frame of 10 msec to 30 msec is stationary; however, in practice different...
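The general idea of low-order LPC on wavelet sub-bands can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes a single-level Haar DWT (the abstract names neither the wavelet nor the decomposition depth), the autocorrelation method with the Levinson-Durbin recursion for LPC, and an arbitrary order of 2 per sub-band. All function names are hypothetical.

```python
import math


def haar_dwt(x):
    """One-level Haar DWT; returns (approximation, detail) sub-bands.

    Assumes len(x) is even; a trailing odd sample would be ignored.
    """
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of x (no windowing, no normalization)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k].

    Returns (coefficients, residual_energy). Requires r[0] > 0.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def subband_lpc(frame, order=2):
    """Concatenate low-order LPC coefficients from each Haar sub-band."""
    features = []
    for band in haar_dwt(frame):
        coeffs, _ = levinson_durbin(autocorrelation(band, order), order)
        features.extend(coeffs)
    return features


if __name__ == "__main__":
    frame = [math.sin(0.2 * n) + 0.5 * math.sin(1.3 * n) for n in range(160)]
    print(subband_lpc(frame))
```

Each sub-band has half the samples and roughly half the bandwidth of the original frame, which is one reason a lower LPC order per band can be sufficient.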
This paper presents a closed-set, text-independent speaker identification system using a continuous density hidden Markov model (CDHMM). Each registered speaker has a separate HMM, which is trained using the Baum-Welch algorithm. The system performance has been studied for different system parameters such as the number of states, the number of mixture components per state, and the amount of data required for training...
This paper presents a technique for face recognition that uses Gabor wavelets with five scales and eight orientations to derive desirable facial features characterized by spatial locality, spatial frequency, and orientation selectivity to cope with the variations due to illumination and facial expression changes. The fractional power polynomial kernel principal component analysis (KPCA) method maps...
In this paper, three simple algorithms for recovery of the phase of a discrete deterministic signal from its bispectrum have been proposed. The algorithms do not involve any phase unwrapping or any solution of a system of equations. Even though the discussion presented here is for deterministic signals, it has been shown that the algorithms can be used for estimation of the phase of linear stochastic...