The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Long Short-Term Memory (LSTM) is the primary recurrent neural networks architecture for acoustic modeling in automatic speech recognition systems. Residual learning is an efficient method to help neural networks converge easier and faster. In this paper, we propose several types of residual LSTM methods for our acoustic modeling. Our experiments indicate that, compared with classic LSTM, our architecture...
Korean is an agglutinative language, in which pronunciations are affected by long-term context. In this paper, the long-time temporal information is investigated to improve Korean LVCSR. TRAP-based MLP features, which are able to utilize the scattered acoustic information over several hundred milliseconds, are employed to obtain additional information besides the conventional cepstral features. In...
In Korean, the pronunciations of phonemes are severely affected by their contexts. Thus, using phonemes directly translated from their written forms as basic units for acoustic modeling is problematic, as these units lack the ability to capture the complex pronunciation variations occurred in continuous speech. Allophone, a sub-phone unit in phonetics but served as independent phoneme in speech recognition,...
In this paper, we propose a method to improve detecting the mispronunciation type of the non-native learners. In order to cope with the low-resource condition of non-native speech and the difference of native and non-native speech, the following efforts are made: 1) train acoustic model with the low-resource non-native data; 2) introduce the articulatory-based tandem feature; 3) pool auxiliary native...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.