In this paper, a two-layer Gaussian mixture model (GMM) structure for vector Taylor series (VTS) feature compensation is proposed for robust speech recognition. Because VTS uses a GMM with many mixture components, its computational complexity is very high. To deal with this issue, we propose a two-layer GMM structure for VTS. Specifically, a GMM with fewer mixture components is utilized...
This paper investigates a noisy Tibetan speech recognition algorithm based on a wavelet neural network (WNN) combined with auditory features. A WNN-based recognition classifier was designed, and Mel-frequency cepstral coefficient (MFCC) features were given. The algorithm was then simulated under different signal-to-noise ratios (SNRs), and the results illustrated...
A key challenge in rapidly building Tibetan language speech recognition applications is minimizing the manual effort required to transcribe and label speech data. Accurate labeling of Tibetan speech utterances is extremely time-consuming and requires trained linguists. To alleviate this problem, we present an approach that aims at reducing the amount of manually transcribed speech data required...
In research on Tibetan language speech recognition, accurate labeling of Tibetan speech utterances is extremely time-consuming and requires trained linguists. To alleviate this problem, we present an approach that can use a small number of labeled Tibetan speech utterances to construct an effective recognition model. The experimental results show that our approach performs better than traditional...
This paper proposes a novel robust speech recognition technique for embedded systems using an improved vector Taylor series (VTS) algorithm. It uses a hidden Markov model (HMM) in place of the Gaussian mixture model (GMM) to estimate the clean speech features, and gives closed-form solutions for the noise parameters, including the mean and variance, at each expectation-maximization (EM) iteration....
This paper presents a new model adaptation algorithm using piecewise linear transformation (PLT) for robust speech recognition. In this algorithm, the nonlinear relationship between training and testing mean vectors is approximated by a set of piecewise linear transformations. The PLT coefficients are estimated from adaptation data by the expectation-maximization (EM) algorithm and maximum likelihood...
To address the problem of Tibetan speech recognition under noisy conditions, this paper presents a Tibetan speech recognition algorithm that combines a radial basis function (RBF) network with auditory features. The Tibetan speech signals were described with Mel-frequency cepstral coefficients (MFCC), and the recognition classifier was designed based on the RBF network with the...