Incorporating multimodal information and temporal context from speakers during an emotional dialog can improve the performance of automatic emotion recognition systems. Motivated by this observation, we propose a hierarchical framework that models emotional evolution within and between utterances, i.e., at the utterance and dialog levels, respectively. Our approach can incorporate a variety of generative or discriminative classifiers at each level and offers flexibility and extensibility in multimodal fusion: facial, vocal, head, and hand movement cues can be included and fused according to the modality and the emotion classification task. Our results on the multimodal, multi-speaker IEMOCAP database indicate that this framework is well-suited to cases where emotions are expressed multimodally and in context, as in many real-life situations.
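To make the two-level idea concrete, the following is a minimal sketch, not the authors' implementation: an utterance-level discriminative classifier produces emotion posteriors from fused multimodal features, and a dialog-level Markov layer re-decodes the utterance sequence with Viterbi to exploit temporal context. The label set, feature fusion by concatenation, the choice of logistic regression, and all function names are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

EMOTIONS = ["angry", "happy", "neutral", "sad"]  # hypothetical label set


def fuse_features(face, voice, motion):
    """Feature-level fusion by simple concatenation (one of several possible schemes)."""
    return np.concatenate([face, voice, motion], axis=-1)


def train_utterance_level(X_fused, y):
    """Utterance-level discriminative classifier over fused multimodal features."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_fused, y)
    return clf


def decode_dialog(posteriors, transitions, priors):
    """Dialog-level Viterbi decoding over per-utterance emotion posteriors,
    modelling how emotions evolve from one utterance to the next.
    `transitions` and `priors` would be estimated from emotion label
    sequences in the training dialogs."""
    T, K = posteriors.shape
    log_post = np.log(posteriors + 1e-12)
    log_trans = np.log(transitions + 1e-12)
    delta = np.log(priors + 1e-12) + log_post[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans   # score of moving state i -> j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_post[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [EMOTIONS[k] for k in reversed(path)]
```

In use, the utterance-level model's `predict_proba` output for the utterances of one dialog would be passed as `posteriors` to `decode_dialog`, so that the dialog-level layer can smooth or override locally ambiguous utterance predictions using the surrounding emotional context.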