Feature-space Maximum Likelihood Linear Regression (FMLLR) speaker adaptation based on bilinear models has shown good performance for GMM-HMMs, especially when the amount of adaptation data is limited. In this paper, we propose using bilinear model features as inputs to deep neural networks (DNNs) for rapid speaker adaptation in acoustic modeling, to facilitate utterance-level normalization. The effectiveness...
This paper presents an improved acoustic keyword spotting (KWS) algorithm that uses a novel confusion garbage model for Mandarin conversational speech. Observing the KWS corpus, we found many words whose pronunciation is similar to that of the predefined keywords, although they are written with different Chinese characters and have different meanings; such words easily lead to a high false alarm rate. In this paper, an improved...
This paper presents an automatic pronunciation transliteration method with acoustic and contextual analysis for a Chinese-English mixed-language keyword spotting (KWS) system. Often, we need to develop robust Chinese-English mixed-language spoken language technology without Chinese-accented English acoustic data. In this paper, we exploit a pronunciation conversion method based on syllable-based characteristic...
This paper gives an up-to-date description of the IBM Mandarin broadcast transcription system developed under the DARPA GALE program. Technical advances over our previous system include a novel acoustic modeling approach using subspace Gaussian mixture models, a speaking rate adaptation method using frame rate normalization, and an effective recipe for lattice combination. We present results on three...
Tone is a distinctive, discriminative feature of Mandarin Chinese. Most large-scale Mandarin speech recognition systems treat tone modeling in a way that is functional but seldom thorough. In particular, many lack the necessary sophistication to deal with the myriad variations arising from the combination of acoustic and lexical contexts. This paper reports an attempt to account for these variabilities...