Ground truth labels obtained by averaging or majority voting are commonly used to train automatic emotion classifiers. However, ground truth labels fail to encapsulate inter-annotator variability and ignore the subjectivity of emotions. In this paper, we propose two viable approaches to model the subjectiveness of emotions by incorporating inter-annotator variability, which are soft labels and model...
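The contrast between conventional majority-vote labels and soft labels can be sketched as follows (the annotator votes and class names below are hypothetical, assuming one categorical vote per annotator per utterance):

```python
import numpy as np

# Hypothetical annotations: 5 annotators each assign one of 4 emotion
# classes (0=neutral, 1=happy, 2=sad, 3=angry) to every utterance.
votes = np.array([
    [1, 1, 3, 1, 3],   # utterance 1
    [2, 2, 2, 2, 0],   # utterance 2
])

n_classes = 4
# per-utterance vote counts for each class
counts = np.apply_along_axis(np.bincount, 1, votes, minlength=n_classes)
# conventional "ground truth": the majority-vote class
hard_labels = counts.argmax(axis=1)
# soft labels: the empirical annotator distribution, which preserves
# inter-annotator variability instead of discarding it
soft_labels = counts / counts.sum(axis=1, keepdims=True)
```

Trained with cross-entropy against `soft_labels`, a classifier learns the full distribution of annotator judgments rather than a single collapsed class.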
Most existing Speech Emotion Recognition (SER) systems rely on turn-wise processing, which aims to recognize emotions from complete utterances, and on an overly complicated pipeline marred by many preprocessing steps and hand-engineered features. To overcome both drawbacks, we propose a real-time SER system based on end-to-end deep learning. Namely, a Deep Neural Network (DNN) that recognizes emotions...
The introduction of Gaussian mixture models (GMMs) in the field of speaker verification has led to very good results. This paper illustrates an evolution in state-of-the-art speaker verification by highlighting the contribution of recently established information-theoretic vector quantization techniques. We explore the novel application of three different vector quantization algorithms, namely...
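One classic vector quantization algorithm used for speaker modeling, Linde-Buzo-Gray (LBG), can be sketched as follows (a minimal NumPy implementation; the splitting factor and iteration count are illustrative choices, not values from the paper):

```python
import numpy as np

def lbg_codebook(data, n_codewords, eps=0.01, iters=20):
    """Linde-Buzo-Gray vector quantization: start from the global
    centroid, then repeatedly split every codeword and refine the
    doubled codebook with k-means-style updates."""
    codebook = data.mean(axis=0, keepdims=True)
    while len(codebook) < n_codewords:
        # perturb each codeword in two directions to double the codebook
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        for _ in range(iters):
            # assign each vector to its nearest codeword (Euclidean)
            dists = np.linalg.norm(data[:, None, :] - codebook[None, :, :], axis=2)
            nearest = dists.argmin(axis=1)
            for i in range(len(codebook)):
                members = data[nearest == i]
                if len(members) > 0:   # leave empty cells unchanged
                    codebook[i] = members.mean(axis=0)
    return codebook
```

A speaker model is then the codebook trained on that speaker's feature vectors; at test time, an utterance can be scored by its average quantization distortion against each speaker's codebook.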
This study presents automatic stress recognition methods based on acoustic speech analysis. Novel approaches to feature extraction based on the nonlinear Teager energy operator (TEO) calculated within critical bands, discrete wavelet transform bands, and wavelet packet bands are presented. The classification process was performed using two types of neural networks: the multilayer perceptron neural...
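The nonlinear Teager energy operator at the heart of these features has a simple discrete form, Psi[x(n)] = x(n)^2 - x(n-1) * x(n+1); a minimal NumPy sketch:

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy operator:
    Psi[x(n)] = x(n)^2 - x(n-1) * x(n+1),
    defined for the interior samples of x."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]
```

For a pure tone A*sin(omega*n) the operator returns the constant A^2 * sin(omega)^2, so it tracks both amplitude and frequency; applying it band-by-band (critical bands, DWT bands, wavelet packet bands) yields the stress-sensitive features described above.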
The speech signal is an important tool for conveying information between humans; at the same time, it is an indicator of a speaker's emotions. In this paper, the automatic identification of affect from speech containing spontaneously expressed (not acted) emotions within different environments was investigated. The Teager energy operator-perceptual wavelet packet (TEO-PWP) features as well as the...
This study investigates the effects of a clinical environment on speaker recognition rates. Two sets of speakers were used: a clinical set containing speech recordings of 70 clinically depressed speakers and a control set containing 68 non-depressed speakers. Mel frequency cepstral coefficient (MFCC) features were used to produce statistical models of the speakers using four modeling methods: GMM_EM, GMM_K-means, GMM_LBG, and...
With suicidal behavior being linked to depression that starts at an early age of a person's life, many investigators are trying to find early tell-tale signs to assist psychologists in detecting clinical depression through acoustic analysis of a patient's speech. The purpose of this paper was to study the effectiveness of Mel frequency cepstral coefficients (MFCCs) in capturing the overall mental...
This study proposes a classification-based facial expression recognition method using a bank of multilayer perceptron neural networks. Six different facial expressions were considered. Firstly, logarithmic Gabor filters were applied to extract the features. Optimal subsets of features were then selected for each expression, down-sampled and further reduced in size via principal component analysis...
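The PCA dimensionality-reduction step mentioned above can be sketched with a plain SVD (random data stands in for the selected log-Gabor features, which are not reproduced here; the target dimensionality is illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 64))    # hypothetical feature vectors (samples x features)
Xc = X - X.mean(axis=0)           # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 10                            # illustrative reduced dimensionality
X_reduced = Xc @ Vt[:k].T         # project onto the top-k principal axes
```

The rows of `Vt` are the principal directions in decreasing order of explained variance, so keeping the first `k` retains the most informative subspace before classification.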
A novel method for facial expression recognition from sequences of image frames is described and tested. The expression recognition system is fully automatic, and consists of the following modules: face detection, maximum arousal detection, feature extraction, selection of optimal features, and facial expression recognition. The face detection is based on the AdaBoost algorithm and is followed by the...