Learning-based auditory encoding for robust speech recognition

Yu-Hsiang Bosco Chiu; Bhiksha Raj; Richard M Stern

doi:10.1109/ICASSP.2010.5495666

Source

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 4278-4281

Abstract

This paper describes ways of speeding up the optimization process for learning physiologically-motivated components of a feature computation module directly from data. During training, word lattices generated by the speech decoder and conjugate gradient descent were included to train the parameters of logistic functions in a fashion that maximizes the a posteriori probability of the correct class in the training data. These functions represent the rate-level nonlinearities found in most mammalian auditory systems. Experiments conducted using the CMU SPHINX-III system on the DARPA Resource Management and Wall Street Journal tasks show that the use of discriminative training to estimate the shape of the rate-level nonlinearity provides better recognition accuracy in the presence of background noise than traditional procedures which do not employ learning. More importantly, the inclusion of conjugate gradient descent optimization and a word lattice to reduce the number of hypotheses considered greatly increases the training speed, which makes training with much more complicated models possible.
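The abstract does not give the functional form or parameterization of the logistic rate-level nonlinearity, so the following is only a minimal sketch of what such a trainable function could look like. The parameter names `w0` (offset), `w1` (slope), and `alpha` (saturation ceiling), and the example input levels, are assumptions made for illustration and are not taken from the paper. The second function returns the partial derivatives that a gradient-based optimizer, such as the conjugate gradient descent mentioned above, would need.

```python
import math


def rate_level(x, w0, w1, alpha=1.0):
    """Logistic mapping from an input level x to an output 'rate'.

    w0, w1 and alpha are hypothetical parameter names for this sketch.
    """
    return alpha / (1.0 + math.exp(-(w0 + w1 * x)))


def rate_level_grad(x, w0, w1, alpha=1.0):
    """Partial derivatives of rate_level with respect to (w0, w1, alpha)."""
    s = 1.0 / (1.0 + math.exp(-(w0 + w1 * x)))
    ds = alpha * s * (1.0 - s)  # derivative of alpha * sigmoid(z) w.r.t. z
    return ds, ds * x, s        # (d/dw0, d/dw1, d/dalpha)


# Illustrative input levels (assumed values, not data from the paper).
levels = [0.0, 20.0, 40.0, 60.0, 80.0]
rates = [rate_level(x, w0=-4.0, w1=0.1) for x in levels]
```

With these assumed values the curve is compressive: it rises through its midpoint at an input level of 40 and saturates toward `alpha` at high levels, which is the qualitative behavior a rate-level nonlinearity is meant to capture.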

Identifiers

Book ISSN: 1520-6149
Book ISBN: 978-1-4244-4295-9
Book e-ISBN: 978-1-4244-4296-6
DOI: 10.1109/ICASSP.2010.5495666

Keywords

speech recognition; acoustic noise; acoustic signal processing; conjugate gradient methods; hearing; speech coding; conjugate gradient descent optimization; learning-based auditory encoding; robust speech recognition; physiologically-motivated components; feature computation module; word lattices; speech decoder; conjugate gradient; logistic functions; posteriori probability; rate-level nonlinearities; mammalian auditory systems; discriminative training; background noise; Training; Accuracy; Noise; Speech; Lattices; Hidden Markov models; data analysis; automatic speech recognition; auditory models

Additional information

Data set: ieee

Publisher

IEEE
