Automatic and spontaneous speech emotion recognition is an important part of a human-computer interactive system. However, identifying emotion in spontaneous speech is difficult because the emotions expressed by the speaker are often not as prominent as in acted speech. In this paper, we propose a spontaneous speech emotion recognition framework that makes use of the associated...
The use of error-correcting codes (ECC) in a multiclass audio emotion recognition problem is proposed to improve emotion recognition accuracy. We visualize the emotion recognition system as a noisy communication channel, thus motivating the use of ECC. We assume the emotion recognition process consists of an audio feature extractor followed by an artificial neural network (ANN) for emotion classification...
We propose the use of error-correcting codes (ECC) in a multi-class audio emotion recognition scenario to improve emotion recognition accuracy in speech. In this paper, we visualize the emotion recognition system as a noisy communication channel, thus motivating the use of ECC in the emotion recognition process. We assume the emotion recognition process consists of an audio...
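The channel-coding view in the two abstracts above can be illustrated with error-correcting output codes (ECOC), a standard way to apply ECC ideas to multi-class classification. This is a minimal sketch only: the emotion labels, the 7-bit codewords, and the minimum distance of 3 below are made-up examples, not the code design used by the authors.

```python
# Illustrative ECOC sketch (not the papers' implementation): each
# emotion class is assigned a binary codeword; an ensemble of binary
# classifiers predicts one bit each, and the (possibly corrupted) bit
# string is decoded to the nearest codeword by Hamming distance --
# treating classifier errors like noise on a communication channel.

CODEBOOK = {                      # hypothetical codewords, min distance 3
    "anger":     (1, 1, 1, 1, 1, 1, 1),
    "happiness": (1, 0, 1, 0, 1, 0, 1),
    "sadness":   (0, 1, 1, 0, 0, 1, 1),
    "neutral":   (0, 0, 0, 0, 0, 0, 0),
}

def hamming(a, b):
    """Number of bit positions where two codewords differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(bits):
    """Map a predicted bit string to the class with the nearest codeword."""
    return min(CODEBOOK, key=lambda cls: hamming(CODEBOOK[cls], bits))

# With minimum distance 3, any single flipped bit is still decoded
# to the intended class:
corrupted = (1, 1, 1, 0, 0, 1, 1)   # "sadness" with its first bit flipped
print(decode(corrupted))            # -> sadness
```

Because the codewords are separated by a Hamming distance of at least 3, up to one misclassified binary output per utterance can be corrected, which is the accuracy gain the ECC framing aims at.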
Automatic speech emotion recognition plays an important role in intelligent human-computer interaction. Identifying emotion in natural, day-to-day, spontaneous conversational speech is difficult because the emotions expressed by the speaker are often not as prominent as in acted speech. In this paper, we propose a novel spontaneous speech emotion recognition framework that makes use...
Estimating emotion from speech is an active and ongoing area of research; however, most of the literature addresses acted speech rather than natural day-to-day conversational speech. Identifying emotion from the latter is difficult because the emotion expressed by non-actors is not necessarily prominent. In this paper we validate the hypothesis, which is based on the observations that human annotators...
Acoustic source localization and sound recognition are common acoustic scene analysis tasks that are usually considered separately. In this paper, a new source localization technique is proposed that works jointly with an acoustic event detection system. Given the identities and the end-points of simultaneous sounds, the proposed technique uses the statistical models of those sounds to compute a likelihood...
The analysis of acoustic scenes requires several functionalities, perhaps the two most relevant being recognition (of speech, speakers, and other acoustic events) and spatial localization. To reduce invasiveness, the microphones are placed far from the sound sources and possibly grouped in arrays, which may be distributed around the room rather than arranged in a fixed geometry. Aiming at increased performance, the usual model-based...
When several acoustic sources are simultaneously active in a meeting-room scenario, and both the positions of the sources and the identities of the time-overlapped sound classes have been estimated, the problem of assigning each source position to one of the sound classes still remains. This problem arises in the real-time system implemented in our smart-room, where it is assumed that up to two acoustic...
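The position-to-class assignment step this abstract describes can be sketched as a small optimal-assignment problem. The function, the score matrix, and its values below are hypothetical illustrations, not the smart-room system's actual models; with at most two simultaneous sources (as the abstract assumes), enumerating permutations is cheap.

```python
# Minimal sketch of assigning estimated source positions to identified
# sound classes.  score[i][j] stands for, e.g., the log-likelihood that
# the signal localized at position i was produced by sound class j; we
# pick the one-to-one assignment that maximizes the total score.

from itertools import permutations

def assign_positions_to_classes(score):
    """Return the best assignment: position i -> class index result[i]."""
    n = len(score)
    return max(permutations(range(n)),
               key=lambda perm: sum(score[i][perm[i]] for i in range(n)))

# Hypothetical scores for two positions and two classes
# (say, speech vs. a door slam):
scores = [[-1.2, -4.0],
          [-3.5, -0.8]]
print(assign_positions_to_classes(scores))   # -> (0, 1)
```

For two sources this compares only two permutations; a real system with more sources would use an optimal-assignment solver instead of brute force.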
Time overlapping of acoustic signals, which so often occurs in real life, is a challenge for current state-of-the-art sound recognition systems. In this work, we propose an approach for detecting, identifying and positioning a set of simultaneous acoustic events in a room environment, using multiple arbitrarily-located microphone arrays, and working in real time. Assuming a set of estimated acoustic...
Speech recognition systems that make use of statistical classifiers require a large number of training samples. However, collecting real samples has always been difficult because it involves a substantial amount of human intervention and cost. To address this problem, this paper presents a novel method for generating synthetic samples from a handful of real samples and investigates...
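The abstract does not specify its generation method; purely as an illustration of the general idea, one common way to expand a handful of real samples is to perturb their feature vectors with small random noise. The function name, the Gaussian noise model, and the `scale` parameter below are assumptions for this sketch, not the paper's method.

```python
# Illustration only: grow a small training set by jittering real
# feature vectors with zero-mean Gaussian noise.  Each synthetic
# sample is a randomly chosen real sample plus per-dimension noise.

import random

def synthesize(real_samples, n_new, scale=0.05, seed=0):
    """Generate n_new synthetic feature vectors from a few real ones."""
    rng = random.Random(seed)           # fixed seed for reproducibility
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(real_samples)                 # pick a real sample
        synthetic.append([x + rng.gauss(0.0, scale) for x in base])
    return synthetic

# Two real 2-dimensional feature vectors expanded into five synthetic ones:
new_samples = synthesize([[1.0, 2.0], [3.0, 4.0]], n_new=5)
```

The noise scale would need tuning: too small and the synthetic samples add no diversity, too large and they no longer resemble the source class.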