This paper advances the design of CTC-based all-neural (or end-to-end) speech recognizers. We propose a novel symbol inventory, and a novel iterated-CTC method in which a second system is used to transform a noisy initial output into a cleaner version. We present a number of stabilization and initialization methods we have found useful in training these networks. We evaluate our system on the commonly...
We describe a method for interpolation of class-based n-gram language models. Our algorithm is an extension of the traditional EM-based approach that optimizes perplexity of the training set with respect to a collection of n-gram language models linearly combined in the probability space. However, unlike prior work, it naturally supports context-dependent interpolation for class-based LMs. In addition,...
The goal of this work was to explore modeling techniques to improve bird species classification from audio samples. We first developed an unsupervised approach to obtain approximate note models from acoustic features. From these note models we created a bird species recognition system by leveraging a phone n-gram statistical model developed for speaker recognition applications. We found competitive...
Constrained cepstral systems, which select frames to match various linguistic “constraints” in enrollment and test, have shown significant improvements for speaker verification performance. Past work, however, relied on word recognition, making the approach language dependent (LD). We develop language-independent (LI) versions of constraints and compare results to parallel LD versions for English...
We investigate the problem of adapting a recognition system with multiple acoustic models to a new domain in unsupervised mode. We compare maximum likelihood and discriminative approaches for unsupervised domain adaptation. Different adaptation data selection methods and adaptation strategies are investigated, using a baseline meeting recognition system and adaptation data from a congressional committee...