The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
The paper deals with affective computing to improve the performance of Human-Machine interaction. The focus of this work is to detect affective state of a human using speech processing techniques primarily intended for call centre applications. Limited work is reported till date on affect detection using phase derived features. A unique combination of Group delay (GD), Phase delay (PD), One Sided...
Prosodic cues are an important part of human communication. One of these cues is the word prominence which is used to e.g. highlight important information. Since individual speakers use different ways of expressing prominence, it is not easily extracted and incorporated in a dialog system. As a consequence, up to date prominence only plays a marginal role in human-machine communication. In this paper...
In this paper, we propose to use deep neural network (DNN) as an effective tool for audio feature extraction. The DNN-derived features can be effectively used in a subsequent classifier (e.g., an SVM in this study) for audio classification. Specifically, we learn bottleneck features from a multi-layer perceptron (MLP), in which Mel filter bank feature is used as network input and one of the hidden...
The method which is called the “tandem approach” in speech recognition has been shown to increase performance by using classifier posterior probabilities as observations in a hidden Markov model. We study the effect of using visual tandem features in audio-visual speech recognition using a novel setup which uses multiple classifiers to obtain multiple visual tandem features. We adopt the approach...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.