Prediction of the prosodic phrase boundary strongly influences the performance of speech recognition and voice synthesis systems. We propose a statistical approach that uses efficient learning features for natural prediction of the Korean prosodic phrase boundary. These new features reflect the factors that affect the generation of the prosodic phrase boundary better than existing learning features...
This paper compares sensibility in recognizing emotions from human voices speaking Japanese and Korean. Our study focuses on the emotional elements contained in the human voice, and our method uses Bayesian networks of prosodic features as models of the sensibilities of Japanese and Korean people in recognizing emotions. The training datasets are prosodic features extracted from emotionally...
Much recent research in speaker recognition has been devoted to robustness against microphone and channel effects. However, changes in vocal effort, especially whispered speech, present significant challenges in maintaining system performance. Because whisper lacks any periodic excitation, the spectral structure of whispered speech differs from that of neutral speech. Therefore, performance of speaker...
The CALO meeting assistant provides for distributed meeting capture, annotation, automatic transcription and semantic analysis of multiparty meetings, and is part of the larger CALO personal assistant system. This paper summarizes the CALO-MA architecture and its speech recognition and understanding components, which include real-time and offline speech transcription, dialog act segmentation and tagging,...
We describe how we were able to improve the accuracy of a medium-vocabulary spoken dialog system by rescoring the list of n-best recognition hypotheses using a combination of acoustic, syntactic, semantic and discourse information. The non-acoustic features are extracted from different intermediate processing results produced by the natural language processing module, and automatically filtered. We...
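The rescoring idea in the abstract above can be illustrated with a minimal sketch. It assumes a simple weighted sum over per-hypothesis feature scores. The feature names follow the abstract, but the `Hypothesis` class, the weights, the scores, and the example utterances are all hypothetical and are not taken from the paper, which may combine the information differently.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    """One entry of an n-best list (hypothetical structure)."""
    text: str
    features: Dict[str, float]  # feature name -> score


# Hypothetical weights; a real system would tune or learn these.
WEIGHTS = {"acoustic": 1.0, "syntactic": 0.5, "semantic": 0.5, "discourse": 0.25}


def score(hyp: Hypothesis, weights: Dict[str, float]) -> float:
    """Weighted sum of the feature scores; unknown features get weight 0."""
    return sum(weights.get(name, 0.0) * value for name, value in hyp.features.items())


def rescore(nbest: List[Hypothesis], weights: Dict[str, float] = WEIGHTS) -> List[Hypothesis]:
    """Return the n-best list sorted by combined score, best first."""
    return sorted(nbest, key=lambda h: score(h, weights), reverse=True)


# Illustrative data: the second hypothesis has the better acoustic score,
# but the non-acoustic features favor the first one.
nbest = [
    Hypothesis("turn on the light",
               {"acoustic": -10.0, "syntactic": 1.0, "semantic": 1.0, "discourse": 1.0}),
    Hypothesis("turn on the lied",
               {"acoustic": -9.5, "syntactic": 0.0, "semantic": 0.0, "discourse": 0.0}),
]
best = rescore(nbest)[0]
print(best.text)
```

In this toy list the acoustically preferred hypothesis is demoted once the non-acoustic scores are added, which is the kind of reordering that rescoring is meant to produce.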
This paper presents a robust classification of dialog acts from text utterances. Two feature types, namely bag-of-words and syntactic relationships among words, were used to extract discourse-level features from the transcripts of utterances. Subsequently, a number of feature mining methods were used to identify the most relevant features and their roles in classifying dialog acts. The selected...
A novel approach to pattern recognition which comprehensively optimizes both a feature extraction process and a classification process is introduced. Assuming that the best features for recognition are the ones that yield the lowest classification error rate over unknown data, an overall recognizer, consisting of a feature extractor module and a classifier module, is trained using the minimum classification...