We present a speech pre-processing scheme (SPPS) for robust speech recognition in the moving motorcycle environment. The SPPS is dynamically adapted during the run-time operation of the speech front-end, depending on short-time characteristics of the acoustic environment. In detail, the fast-varying acoustic environment is modeled by GMM clusters, based on which a selection function determines the...
This paper describes ways of speeding up the optimization process for learning physiologically-motivated components of a feature computation module directly from data. During training, word lattices generated by the speech decoder and conjugate gradient descent were included to train the parameters of logistic functions in a fashion that maximizes the a posteriori probability of the correct class...
Accurate voice activity detection (VAD) is important for robust automatic speech recognition (ASR) systems. We have proposed a statistical-model-based VAD using the long-term temporal information in speech, which shows good robustness against noise in an automobile environment. For further improvement, this paper describes a new method to exploit harmonic structure information with statistical models...
The paper proposes a study of a background noise classifier based on a pattern recognition approach using a neural network. The signals submitted to the neural network are characterised by means of a set of 12 MFCC (Mel frequency cepstral coefficient) parameters typically present in the front end of a mobile terminal. The performance of the classifier, evaluated in terms of percent misclassification,...
This paper presents audio noise classification using Bark-scale features and the K-NN technique. The audio noise signals are taken from NOISEX-92 (12 types). We determine the transfer functions from the linear predictive coding (LPC) coefficients of the noise signals on the Bark scale and use the K-NN technique to classify them. The results will be used for optimization of speech recognition model in the presence...
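As a rough illustration of the K-NN classification step mentioned in the abstract above, the following is a minimal sketch, not the authors' method. The feature vectors are made-up toy values standing in for Bark-scale features, and the function name `knn_classify`, the choice of Euclidean distance, and `k=3` are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    train: list of (feature_vector, label) pairs (hypothetical toy data here).
    query: feature vector with the same length as the training vectors.
    """
    # Rank training samples by Euclidean distance to the query and keep the k nearest.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of the k nearest neighbours.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy feature vectors (invented for illustration, not real NOISEX-92 features).
train = [
    ([0.9, 0.1, 0.0], "babble"),
    ([0.8, 0.2, 0.1], "babble"),
    ([0.1, 0.9, 0.7], "factory"),
    ([0.2, 0.8, 0.9], "factory"),
]

predicted = knn_classify(train, [0.85, 0.15, 0.05], k=3)
```

In a real system the toy vectors would be replaced by features actually extracted from the noise recordings, and the distance measure and `k` would need to be tuned on held-out data.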
Myoelectric signals (MESs) from the speaker's mouth region have been shown to improve the noise robustness of automatic speech recognizers (ASRs), thus promising to extend their usability in noisy environments. In the recognition system presented herein, extracted audio and facial MES features were integrated by a decision fusion method, where the likelihood score of the audio-MES...