The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
One of the difficulties in sung speech recognition is the small distance in an acoustic space between phonemes in sung speech. Therefore we considered clustering the speech based on a pitch (fundamental frequency F0) and creating a larger distance between the phonemes. In addition, we considered a two-stage training method of DNN-HMM: the first stage is trained by using conventional acoustic features...
Deep neural networks (DNN) have achieved significant success in the field of speech recognition. One of the main advantages of the DNN is automatic feature extraction without human intervention. Therefore, we incorporate a pseudo-filterbank layer to the bottom of DNN and train the whole filterbank layer and the following networks jointly, while most systems take pre-defined mel-scale filterbanks as...
In this paper, we propose approaches for the Gaussian mixture model (GMM) based soft clustering of training data and the GMM- or/and hidden Markov model (HMM)-based cluster selection in age and gender-independent speech recognition. Typically, increasing the number of speaker classes leads to more specific models in speaker-class-dependent speech recognition, and thus better recognition performance...
Hidden Conditional Random Fields(HCRF) is a very promising approach to model speech. However, because HCRF computes the score of a hypothesis by summing up linearly weighted features, it cannot consider non-linearity among features that will be crucial for speech recognition. In this paper, we extend HCRF by incorporating gate function used in neural networks and propose a new model called Hidden...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.