Acoustic feature extraction from speech is a fundamental part of both automatic speech recognition and automatic speaker recognition. Mel-frequency cepstral coefficients (MFCCs) are widely used in both research directions. A new feature extraction technique named perceptual MVDR-based cepstral coefficients (PMCCs) has been shown to give superior performance in automatic speech recognition...
In this paper, we propose a speech emotion recognition system using both spectral and prosodic features. Most traditional systems have focused on either spectral features or prosodic features. Since both spectral and prosodic features contain emotion information, combining them is expected to improve the performance of the emotion recognition system...
Gaussian mixture models with a universal background model (UBM) have been the standard method for speaker recognition. Typically, maximum a posteriori (MAP) adaptation or maximum likelihood linear regression (MLLR) is used to adapt the means of the UBM. Together with the SVM modeling technique, these approaches can achieve excellent performance. MLLR is quite efficient when the amount of adaptation data is...
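The MAP mean adaptation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a diagonal-covariance UBM and the common relevance-factor form of the update; the function name `map_adapt_means`, its parameters, and the default `relevance=16.0` are illustrative choices, not details taken from the paper.

```python
import numpy as np

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Relevance-MAP adaptation of UBM means (hypothetical helper, diagonal covariances).

    X: (T, D) adaptation frames; weights: (K,); means, variances: (K, D).
    Returns the adapted means with shape (K, D).
    """
    # Log-density of every frame under every diagonal Gaussian: shape (T, K).
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    # Component posteriors per frame, normalised in the log domain for stability.
    log_post = np.log(weights)[None, :] + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Zeroth- and first-order sufficient statistics.
    n = post.sum(axis=0)                       # soft counts, shape (K,)
    first = post.T @ X                         # shape (K, D)
    data_mean = first / np.maximum(n, 1e-10)[:, None]

    # Interpolate between the data mean and the UBM mean per component.
    alpha = (n / (n + relevance))[:, None]
    return alpha * data_mean + (1.0 - alpha) * means
```

With this update, a component that sees many adaptation frames moves toward the mean of those frames, while a component that sees none keeps its UBM mean. For example, 16 identical frames at (1, 1) with `relevance=16.0` move a UBM mean at the origin halfway, to (0.5, 0.5).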
Modern lifestyles have increased the risk of voice disorders. It is estimated that nearly 19% of the population has suffered from dysphonic voicing, so detecting pathological voices automatically is very important. Many classification methods have been applied to this task with good results. In this paper, we focus on the automatic detection...
A novel voice conversion system using phoneme-based linear mapping functions on main vowel phonemes is proposed in this paper. Our voice conversion algorithm has the following three improvements. First, instead of using all the vocal tract resonance (VTR) vectors within a phoneme, we use the VTR vector at the steady state of each phoneme to train a phoneme-based GMM. Second, different linear...
A novel voice conversion system using formant mapping based on a modified GMM technique is proposed in this paper. Compared with the traditional GMM technique, our modified GMM technique automatically selects the stable frames in each vowel phoneme for parameter extraction, which avoids using parameters from the transition part. With the spectral parameters extracted from the stable frames, phoneme-based...