Recognition of epileptic seizures is an important issue, and in certain circumstances it is desirable to have portable equipment implementing the algorithm in order to better monitor patients. This work considers a widely used EEG database from the University of Bonn as a reference for comparing our recognition method with others previously reported. In order to perform epileptic seizure recognition we combine a...
This paper presents improvements in accuracy for object shape classification using a new low-complexity method, compared to a previous implementation [1]. The method uses echoes generated by a Java platform capable of emulating sound propagation in a controlled 2D virtual environment [2][3]. Echoes originate from the ultrasonic waves generated inside a virtual environment which contains geometrical...
The support vector machine (SVM) algorithm has received much attention in voiceprint recognition research, especially for small-sample datasets. However, as the number of items to recognize and the number of speech features increase, the speed of model training and recognition drops significantly. To solve this problem, a new weighted clustering algorithm is proposed, which uses a “one to one” SVM model...
Speaker identification systems are becoming more important in today's world. This is especially true as devices rely on the user to speak commands. In this article, an analysis of how a text-independent voice identification system can be built is presented. Extracting the Mel-Frequency Cepstral Coefficients is evaluated and a support vector machine is trained and tested on two different data sets,...
Embedded dictation, i.e. recognizing vocal commands in noisy environments, with good accuracy and using low complexity implementations is a desirable task with many applications. Such applications include automotive infotainment solutions particularly when no connectivity is available, personal assistants including embedded dictation solutions for disabled people, and so on. This paper reports our...
In this paper, the efficiency of Support Vector Machine (SVM) and Binary Support Vector Machine (BSVM) techniques in utterance-based emotion recognition is compared. Acoustic features including energy, Mel-frequency cepstral coefficients (MFCC), perceptual linear predictive (PLP) coefficients, filter bank (FBANK) features, pitch, and their first and second derivatives are used as frame-based features. Four basic emotions...
To address the problem of low speech recognition rates, an improved method that combines a Deep Belief Network (DBN) with a support vector machine (SVM) for analyzing small-sample speech signals is proposed. The speech signal data collected as training samples are used to train the DBN to obtain the optimal parameter values. The trained DBN is utilized for feature extraction, and these speech sample data signals...
Traditional speech-related identity recognition commonly attends to a single aspect of speech signals, but in reality speech signals are made up of semantics, speaker-dependent features, etc. This paper therefore presents a new study that simultaneously recognizes multidimensional speaker information. In order to extract sufficient relational features, both high-level and low-level features...
We propose a neural-network training algorithm that is robust to data imbalance in classification. In our proposed algorithm, weights are assigned to training examples, effectively modifying the trajectory traversed in the parameter space during the learning process. Furthermore, the proposed algorithm reduces to normal stochastic gradient descent learning if the data is balanced. On the...
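As an illustration of the general idea of per-example weighting (not the algorithm of the paper above, whose details are truncated in this listing), the following minimal sketch assumes inverse class-frequency weights and a one-dimensional logistic model. The function names `class_balanced_weights` and `weighted_sgd_logistic` are hypothetical.

```python
import math
import random
from collections import Counter


def class_balanced_weights(labels):
    """Assumed weighting scheme: n / (k * count(class)); every weight is 1.0 when balanced."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return [n / (k * counts[y]) for y in labels]


def weighted_sgd_logistic(xs, ys, weights, lr=0.1, epochs=200, seed=0):
    """Train a 1-D logistic regression where each SGD step is scaled by the example weight."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            p = 1.0 / (1.0 + math.exp(-(w * xs[i] + b)))
            grad = p - ys[i]  # gradient of the log loss with respect to the logit
            w -= lr * weights[i] * grad * xs[i]
            b -= lr * weights[i] * grad
    return w, b
```

With balanced labels every weight equals 1.0, so the update is ordinary SGD. With labels `[0, 0, 0, 1]` the minority example receives weight 2.0 and each majority example receives 2/3.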
Automatic spoken digit recognition is one of the important areas in speech recognition. Recognition of spoken digits in local languages is the next stage in this technological advancement. This paper presents a new approach for Pashto digit recognition using spectral and prosodic feature extraction. Little or no work has been done on Pashto spoken digit recognition. That is why no standard...
Emotions exhibited by a speaker can be detected by analyzing his/her speech, facial expressions and gestures, or by combining these properties. This paper concentrates on determining the emotional state from speech signals. Various acoustic features such as energy, zero crossing rate (ZCR), fundamental frequency, Mel Frequency Cepstral Coefficients (MFCCs), etc. are extracted for short-term, overlapping...
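Two of the features named above, short-term energy and zero crossing rate, can be computed over overlapping frames roughly as follows. This is a sketch using common textbook definitions, not the paper's implementation. The frame length, hop size, and function names are assumptions.

```python
def frame_signal(signal, frame_len, hop):
    """Split a sequence into frames; a hop smaller than frame_len gives overlap."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def short_term_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (frame needs >= 2 samples)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)
```

A signal that alternates in sign at every sample has a zero crossing rate of 1.0 in each frame, while a constant frame has a rate of 0.0.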
Many pattern recognition problems involve characterizing samples with continuous labels instead of discrete categories. While regression models are suitable for these learning tasks, these labels are often discretized into binary classes to formulate the problem as a conventional classification task (e.g., classes with low versus high values). This methodology brings intrinsic limitations on the classification...
In this paper, we propose an efficient approach to identify the opinion leader in a group discussion. This approach is able to recognize the opinion leader without analyzing semantic and syntactic features, which may require much more computing effort. We first propose algorithms to evaluate the degree of participation and the emotion expression from the speech of each member during group discussion...
In this paper, we propose to use a kernel sparse representation based classifier (KSRC) for the task of speech emotion recognition. Further, the recognition performance using the KSRC is improved by imposing a group sparsity constraint. Speech utterances with the same emotion may have different durations, but the frame sequence information does not play a crucial role in this task. Hence, in this work,...
The current work presents a multilingual speech-to-text conversion system. Conversion is based on information in the speech signal. Speech is the natural and most important form of communication for human beings. A Speech-To-Text (STT) system takes a human speech utterance as input and returns a string of words as output. The objective of this system is to extract, characterize and recognize the information...
A ‘weak classifier’ is a classifier that performs badly for any of several reasons. In general, bad performance can be caused by the high dimensionality of the data and also by the instability of the classifier. Ensemble methods have been developed in order to overcome these problems. The most popular are bagging and Random Subspace Methods (RSM). We propose to use a combination of concepts used in Bagging and...
Speech emotion recognition has become an active topic in pattern recognition. Specifically, the support vector machine (SVM) is an effective classifier due to the application of the nonlinear mapping function, which can map the data into a high or even infinite dimensional feature space. However, a single kernel function might not be sufficient to describe the different properties of spontaneous speech emotion...
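The two ensemble ideas named above can be illustrated by how each ensemble member's training view is sampled: bagging resamples rows with replacement, while the random subspace method selects a subset of features. The sketch below simply applies both at once. How the paper actually combines them is not stated in this truncated abstract, so the function names and the combination are assumptions.

```python
import random


def bootstrap_rows(n_rows, rng):
    """Bagging: draw n_rows row indices with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_subspace(n_features, subspace_size, rng):
    """RSM: draw a subset of feature indices without replacement."""
    return sorted(rng.sample(range(n_features), subspace_size))


def make_member_view(X, rng, subspace_size):
    """Return (feature indices, bootstrap-resampled rows restricted to those features)."""
    rows = bootstrap_rows(len(X), rng)
    feats = random_subspace(len(X[0]), subspace_size, rng)
    return feats, [[X[r][f] for f in feats] for r in rows]
```

Each base classifier would then be trained on its own view, and predictions combined by voting. The feature indices must be stored so that test samples are projected onto the same subspace.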
This paper presents the construction of Binary Support Vector Machines and their significance for efficient Speech Emotion Recognition (SER). German Emotional Speech Corpus EmoDB has been used in this study. Seven Binary Support Vector Machines (SVMs) corresponding to each of the seven emotions in the EmoDB, namely Anger-Not Anger, Boredom-Not Boredom, Disgust-Not Disgust, Fear-Not Fear, Happy-Not Happy,...
Emotion recognition from speech helps us in improving the effectiveness of human-machine interaction. This paper presents a method to identify suitable features in the DWT domain and improve accuracy. In this work, 7 emotions (Berlin Database) are recognized using a Support Vector Machine (SVM) classifier. Entropy of Teager Energy operated Discrete Wavelet Transform (DWT) coefficients, Linear Predictive...
This paper presents different approaches for developing a speaker recognition system to be used in a voice control interface for an assistive domotic system. Experimental research has been carried out in MATLAB, using the Voicebox toolbox for data preparation and feature extraction and using LIBSVM for speaker recognition.
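The "emotion versus not-emotion" setup described above amounts to building one binary target vector per class. The sketch below shows that label construction only, using the five emotion names visible in the truncated abstract. The max-score decision rule in `predict_from_scores` is a common convention assumed here, not something the abstract states, and both function names are hypothetical.

```python
def one_vs_rest_targets(labels, classes):
    """For each class c, build a 0/1 target vector: 1 where the label equals c, else 0."""
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def predict_from_scores(scores):
    """Assumed decision rule: pick the class whose binary classifier scores highest."""
    return max(scores, key=scores.get)


# Only the emotions named in the excerpt are listed here.
classes = ["anger", "boredom", "disgust", "fear", "happy"]
labels = ["anger", "fear", "anger", "happy"]
targets = one_vs_rest_targets(labels, classes)
```

Each binary SVM would be trained on the same feature vectors with its own target vector from `targets`.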