In human speech, most boundaries between phones and words are fuzzy. Given a time slice that contains only a single boundary, the boundary may lie at any frame within the slice. The different boundary locations yield several potential observation segments, which should have similar acoustic spaces because they are adjacent in the time domain. We call them neighboring segments...
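The idea of neighboring segments can be illustrated with a minimal sketch. It assumes a segment that starts at a known frame and a time slice holding exactly one boundary; the function name, the argument names, and the exclusive-end frame convention are hypothetical and not taken from the paper.

```python
def neighboring_segments(start, slice_lo, slice_hi):
    """Enumerate candidate segments for a boundary somewhere in a slice.

    start    -- first frame of the segment (assumed known)
    slice_lo -- first frame at which the boundary may fall
    slice_hi -- last frame at which the boundary may fall (inclusive)

    Returns a list of (start, end) pairs with an exclusive end frame,
    one pair per possible boundary location in the slice.
    """
    if not (start < slice_lo <= slice_hi):
        raise ValueError("expected start < slice_lo <= slice_hi")
    return [(start, boundary) for boundary in range(slice_lo, slice_hi + 1)]


if __name__ == "__main__":
    # A segment starting at frame 10 whose end boundary lies in frames 20..23.
    for seg in neighboring_segments(10, 20, 23):
        print(seg)
```

Each returned pair differs from its neighbor by a single frame, which is why such segments would be expected to share similar acoustic characteristics.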
With the increasing demand for natural interaction between humans and machines, emotion perception from speech signals is becoming an important part of the interaction interface. In this paper, we present a feature extraction framework for speech emotion recognition and propose a novel method to extract emotion information based on group sparsity in tensor space. The speech signal is encoded as a cortical representation...
The stochastic segment model (SSM) has been shown to be a competitive alternative to the hidden Markov model (HMM). In this paper, we extend the theory of maximum likelihood linear regression (MLLR) adaptation, which is widely used in HMM-based systems, to the stochastic segment model, and derive an SSM-based MLLR adaptation method. Continuous speech recognition experiment using the SSM-based MLLR...
In this paper, a two-stage multi-speaker identification (SID) system is proposed for mixed speech in which multiple speakers talk simultaneously. By investigating the second-stage processing, we improved the performance of multi-speaker SID from 94.6% to 99.0% on a standard test set; compared with another state-of-the-art system, the proposed system's results were also slightly better. We also examined...
In this paper, a novel one-pass coarse-to-fine decoding algorithm is proposed to accelerate segment model (SM) decoding. The algorithm originates from the segmentation similarity observation described in the paper and is specific to SM-based speech recognition. At each step, a coarse search is first performed to obtain coarse segmentations, and then a fine search is performed based on the...