This paper introduces the semi-continuous Hidden Markov Model (HMM) and proposes a novel Dynamic Bayesian Network (DBN) model for dynamic speech emotion recognition. The former reduces the training complexity caused by Gaussian mixtures by sharing the Conditional Probability Densities (CPDs) of the Gaussians among the states, and the latter adds a sub-state layer between the state and observation layers based...
Recently, Activity Recognition (AR) has become a popular research topic and gained attention in the research community because of the increasing availability of sensors in consumer products, such as GPS, vision, audio, light, temperature, direction, and acceleration sensors. The availability of a variety of sensors creates many new opportunities for data mining...
This paper describes a research project conducted to investigate the relationship between EEG signals and human emotions. EEG signals are used to classify three kinds of emotions: positive, neutral and negative. Firstly, a literature review was performed to establish a suitable approach for emotion recognition. Secondly, we extracted features from the original EEG data using a fourth-order wavelet...
This paper proposes an approach to detecting emotion from human speech by employing a majority voting technique over several machine learning classifiers. The contribution of this work is twofold: firstly, it selects those features of speech that are most promising for classification, and secondly, it uses a majority voting technique that selects the final class of emotion. Here, the majority voting technique...
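The vote-counting step described in the abstract above can be sketched as follows. The classifier outputs and emotion labels are hypothetical; the snippet illustrates only the majority-vote aggregation, not the feature selection or the underlying classifiers.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the emotion label chosen by the most classifiers.

    Ties are broken in favour of the label seen first, since Counter
    preserves insertion order for equal counts.
    """
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical outputs of three classifiers for one utterance:
votes = ["anger", "joy", "anger"]
print(majority_vote(votes))  # anger
```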
The paper focuses on automatic detection of the emotion of fury in audio recordings, using data extracted from the vocalic analysis of formants. We studied speech prosody and voice inflexions, and we recognised fury using classification algorithms applied to two databases, one with professional voices and another with normal voices, both recorded from selected texts in the Romanian language...
This project presents a speech-based control system for a drone using Support Vector Machines (SVMs). The set of control commands, consisting of BACKWARD, FORWARD, HOLD ON, LANDING, MOVE UP, MOVE DOWN, TAKE OFF, TURN LEFT and TURN RIGHT, is used to train the SVM. The speech features extracted in this study comprise the fundamental frequency, energy, and Mel Frequency Cepstral Coefficients (MFCCs). For...
In this paper we focus on effective automatic speech emotion recognition for human-computer interaction in Tamil. Due to the unavailability of a Tamil-language database for emotion recognition, we built a database of emotional speech in Tamil. This database consists of 19 wave clips expressing anger, joy, fear, neutral and sad emotions. We then extract cepstral-based features such as MFCCs. A German...
Speech is, apart from facial expressions, one of the most promising modalities through which various human emotions such as happiness, anger, sadness, and the normal state can be determined. Researchers have shown that acoustic parameters of a speech signal such as energy, pitch, and Mel Frequency Cepstral Coefficients (MFCCs) are vital in determining the emotional state of a person. There is an increasing need for...
The use of digital technology is growing at a very fast pace, which has led to the emergence of systems based on cognitive infocommunications. The expansion of this sector imposes the use of combined methods in order to ensure robustness in cognitive systems.
Emotion recognition is closely related to friendly and humanistic human-robot interaction. Our paper aims to develop a new kind of feature for Mandarin emotional speech signals based on multifractal theory. Firstly, differences in phase-space structure between initials and finals indicate fractal phenomena in the speech production process. Further, a positive largest Lyapunov exponent...
Biometric Speaker Verification (SV) and Identification (SI) systems for wireless remote-access security need to be less vulnerable to distortion caused by speech coding. This paper presents results for a recognition system operating on the decoded speech of the G.729 codec. To show the performance loss due to distortion introduced in the decoding step, we are oriented...
This paper presents a novel approach to emotional speech recognition. Instead of using the full length of a speech signal for classification, the proposed method decomposes speech signals into component words, groups the words into segments, and generates an acoustic model for each segment using features such as audio power, MFCCs, log attack time, spectrum spread and segment duration. Based on the proposed...
Lip movement is closely related to speech, because the lips move when we talk. The idea behind this work is to extract lip-movement features from facial video and embed them into the speech signal using an information-hiding technique. Using the proposed framework, we can provide advanced speech communication using only the speech signal, which includes the lip-movement features,...
This paper addresses the problem of automatic recognition of emotional states from speech recordings, especially those emotions indicating that a person's life or physical integrity is at risk. The paper compares the performance of two different systems: one fed with speech signals recorded directly from people (whole spectrum), and another in which the speech signals are recorded...
A phoneme recognition system based on Discrete Wavelet Transforms (DWT) and Support Vector Machines (SVMs) is designed for multi-speaker continuous-speech environments. Phonemes are divided into frames, and the DWT is applied to obtain fixed-dimensional feature vectors. For the multiclass SVM, the one-against-one method with the RBF kernel was implemented. To further improve the accuracies obtained,...
This paper presents the design of a digital hardware implementation based on Support Vector Machines (SVMs) for the task of multi-speaker phoneme recognition. The one-against-one multiclass SVM method with the Radial Basis Function (RBF) kernel was considered. Furthermore, a priority scheme was included in the architecture in order to forecast the three most likely phonemes. The designed system...
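The one-against-one decision rule used in the two phoneme-recognition abstracts above, including a top-three ranking like the priority scheme mentioned, can be sketched as follows. The prototype table and distance-based binary decision are stand-ins for trained binary SVMs, and the phoneme labels and 1-D "features" are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy 1-D "feature" per phoneme; a stand-in for real acoustic models.
prototypes = {"aa": 1.0, "iy": 5.0, "uw": 9.0}

def binary_predict(x, a, b):
    """Stand-in for a trained binary SVM deciding between classes a and b."""
    return a if abs(x - prototypes[a]) <= abs(x - prototypes[b]) else b

def one_against_one(x, classes, top=1):
    """Collect k(k-1)/2 pairwise votes; return the `top` highest-voted classes."""
    votes = Counter({c: 0 for c in classes})  # keep zero-vote classes rankable
    for a, b in combinations(classes, 2):
        votes[binary_predict(x, a, b)] += 1
    return [label for label, _ in votes.most_common(top)]

# Forecast the three most likely phonemes for one toy input:
print(one_against_one(4.2, list(prototypes), top=3))  # ['iy', 'aa', 'uw']
```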
We consider the effects of incorporating prior knowledge of features which correlate with phoneme identity as well as perceptual invariances into the design of SVM kernels for phoneme classification in high-dimensional spaces of acoustic waveforms of speech. To this end we explore products and linear combinations of polynomial and radial basis function kernels to design composite kernels which are...
With the increasing demand for spoken-language interfaces in human-computer interaction, automatic recognition of emotional states from human speech has become increasingly important. Unfortunately, obtaining human annotations of an emotion corpus to train a supervised system can be a laborious and costly effort. To address this, we explore active learning techniques with the objective of reducing...
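One common active-learning strategy consistent with the goal stated above, least-confidence uncertainty sampling, can be sketched as follows. The posterior values and clip names are invented, and `predict_proba` stands in for whatever supervised emotion classifier is being trained.

```python
def uncertainty_sampling(pool, predict_proba, budget):
    """Return the `budget` unlabeled items whose top class probability is
    lowest, i.e. the ones the classifier is least sure about and an
    annotator should label next (least-confidence strategy)."""
    return sorted(pool, key=lambda x: max(predict_proba(x)))[:budget]

# Invented posteriors over three emotion classes for three speech clips:
posteriors = {
    "clip1": [0.90, 0.05, 0.05],  # confident -> low annotation priority
    "clip2": [0.40, 0.35, 0.25],  # uncertain -> label first
    "clip3": [0.50, 0.30, 0.20],
}
print(uncertainty_sampling(list(posteriors), posteriors.get, 1))  # ['clip2']
```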
Classifiers in Automatic Speech Recognition (ASR) aim to improve the generalization ability of machine learning and the recognition accuracy in noisy environments. This paper discusses the classification performance of Hidden Markov Models (HMMs) and Support Vector Machines (SVMs) applied to a wavelet-front-end-based ASR system. The experiments are performed on the speaker-independent TIMIT database...
Despite the great progress of Hidden Markov Models (HMMs), these models lack discriminative ability, especially in speech recognition. To improve the results of recognition systems, we apply Support Vector Machines (SVMs) as estimators of posterior probabilities, since they are characterized by high predictive power and discrimination. Moreover, they are based on a structural risk...