This paper proposes a model which approximates full covariance matrices in Gaussian mixture models (GMMs) while reducing the number of parameters and the computations required for likelihood evaluations. In the proposed model, inverse covariance (precision) matrices are approximated using sparsely represented eigenvectors, i.e. each eigenvector of a covariance/precision matrix is represented as a linear combination...
In automatic speech recognition, the dominant approach is a statistical framework based on hidden Markov models combined with Gaussian mixture models. The issues to be solved are: how to obtain a statistically efficient estimate of the model parameters, especially the covariance matrix, whose number of parameters is proportional to the square of the dimensionality of the feature...
This paper presents a study of how speaker recognition accuracy depends on the choice of features, window width and model complexity. Standard features were considered: linear prediction coefficients (LPC), perceptual linear prediction (PLP) coefficients and mel-frequency cepstral coefficients (MFCC). A Gaussian mixture model (GMM), built using HTK tools, was chosen for speaker modelling. Speech database S70W100s120,...
In this paper, the impact of pitch on the variability of MFCCs, and its influence on the performance of an automatic speech recognition system, is analyzed. When a speaker has a high pitch, the distance between adjacent harmonics in the spectrum of voiced phonemes is larger, which results in a poorer description of the spectral envelope. An additional problem arises when a band-pass...