Text, Speech and Dialogue

chapter

Front Matter

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 563-563

chapter

Feature Subset Selection Based on Evolutionary Algorithms for Automatic Emotion Recognition in Spoken Spanish and Standard Basque Language

Aitor Álvarez, Idoia Cearreta, Juan Miguel López, Andoni Arruti, et al.

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 565-572

The study of emotions in human-computer interaction is a growing research area. In automatic emotion recognition, work is under way to achieve good results, particularly in speech and facial gesture recognition. In this paper we present a study that analyzes the validity of different machine learning techniques for automatic speech emotion recognition. Using a bilingual...

chapter

Two-Dimensional Visual Language Grammar

Siska Fitrianie, Leon J. M. Rothkrantz

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 573-580

Visual language refers to the idea that communication occurs through visual symbols, as opposed to verbal symbols or words. In contrast to sentence construction in spoken language, which has a linear ordering of words, a visual language has a simultaneous structure with a parallel temporal and spatial configuration. Inspired by Deikto [5], we propose a two-dimensional string or sentence construction of visual...

chapter

Are You Looking at Me, Are You Talking with Me: Multimodal Classification of the Focus of Attention

Christian Hacker, Anton Batliner, Elmar Nöth

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 581-588

Automatic dialogue systems are easily confused when they recognize speech that is not directed at the system. Besides noise or other people’s conversation, even the user’s own utterances can cause difficulties when he is talking to someone else or to himself (“Off-Talk”). In this paper the automatic classification of the user’s focus of attention is investigated. In the German SmartWeb project, a mobile...

chapter

Visualization of Voice Disorders Using the Sammon Transform

Tino Haderlein, Dominik Zorn, Stefan Steidl, Elmar Nöth, et al.

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 589-596

The Sammon Transform performs data projections in a topology-preserving manner on the basis of an arbitrary distance measure. We use the weights of the observation probabilities of semi-continuous HMMs that were adapted to the current speaker as input. Experiments on laryngectomized speakers with tracheoesophageal substitute voice, hoarse, and normal speakers show encouraging results. Different speaker...

chapter

Task Switching in Audio Based Systems

Melanie Hartmann, Dirk Schnelle

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 597-604

The worker on the move has an ever-increasing need to access information, such as instructions on how to proceed with a task. Using audio to convey that information and to interact has many advantages over traditional hands-and-eyes devices, especially if the user needs his hands to perform a task. In this paper, we focus on a task model stored in a workflow engine. The execution of a task...

chapter

Use of Negative Examples in Training the HVS Semantic Model

Filip Jurčíček, Jan Švec, Jiří Zahradil, Libor Jelínek

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 605-612

This paper describes the use of negative examples in training the HVS semantic model. We present a novel initialization of the lexical model using negative examples extracted automatically from a semantic corpus, as well as a description of an algorithm for extracting these examples. We evaluated the use of negative examples on a closed-domain human-human train timetable dialogue corpus. We significantly...

chapter

Czech-Sign Speech Corpus for Semantic Based Machine Translation

Jakub Kanis, Jiří Zahradil, Filip Jurčíček, Luděk Müller

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 613-620

This paper describes progress in the development of a human-human dialogue corpus for machine translation of spoken language. We have chosen a semantically annotated corpus of phone calls to a train timetable information center. The phone calls consist of inquiries regarding callers' train travel plans. Corpus dialogue act tags incorporate abstract semantic meaning. We have enriched a part of the corpus...

chapter

Processing of Requests in Estonian Institutional Dialogues: Corpus Analysis

Mare Koit, Maret Valdisoo, Olga Gerassimenko, Tiit Hennoste, et al.

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 621-628

The paper analyses how an information operator processes a customer’s requests. The study is based on the Estonian dialogue corpus. Our further aim is to develop a dialogue system (DS) which interacts with a user in Estonian and recognises, interprets and grants a user’s requests automatically. There are two main classes of computational models of the interpretation of dialogue acts – cue-based and...

chapter

Using Prosody for Automatic Sentence Segmentation of Multi-party Meetings

Jáchym Kolář, Elizabeth Shriberg, Yang Liu

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 629-636

We explore the use of prosodic features beyond pauses, including duration, pitch, and energy features, for automatic sentence segmentation of ICSI meeting data. We examine two different approaches to boundary classification: score-level combination of independent language and prosodic models using HMMs, and feature-level combination of models using a boosting-based method (BoosTexter). We report classification...

chapter

Simple Method of Determining the Voice Similarity and Stability by Analyzing a Set of Very Short Sounds

Konrad Lukaszewicz, Matti Karjalainen

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 637-643

This paper presents a simple method of determining the voice similarity by analyzing a set of very short sounds. A large number of pitch-length sounds were extracted from natural voice signals from different realizations of open vowels ’a’ and ’o’. The voice similarity was defined as the sum of single elementary similarities of short sound pairs. This method is oriented to the microphonemic speech...

chapter

Visualization of Prosodic Knowledge Using Corpus Driven MEMOInt Intonation Modelling

David Escudero-Mancebo, Valentín Cardeñoso-Payo

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 645-652

In this work we show how our corpus-driven intonation modelling methodology MEMOInt can help in the graphical visualization of the complex relationships between the different prosodic features which configure the intonational aspects of natural speech. MEMOInt has already been used successfully for the prediction of synthetic F0 contours in the presence of the usual data scarcity problems...

chapter

Automatic Annotation of Dialogues Using n-Grams

Carlos D. Martínez-Hinarejos

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 653-660

The development of a dialogue system for any task implies the acquisition of a dialogue corpus in order to study the structure of the dialogues used in that task. This structure is reflected in the dialogue system behaviour, which can be rule-based or corpus-based. In the case of corpus-based dialogue systems, the behaviour is defined by statistical models which are inferred from an annotated corpus...

chapter

PPChecker: Plagiarism Pattern Checker in Document Copy Detection

NamOh Kang, Alexander Gelbukh, SangYong Han

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 661-667

Nowadays, most documents are produced in digital format, in which they can be easily accessed and copied. Document copy detection is a very important tool for protecting the author’s copyright. We present PPChecker, a document copy detection system based on plagiarism pattern checking. PPChecker calculates the amount of data copied from the original document to the query document, based on linguistically-motivated...

chapter

Segmental Duration Modelling in Turkish

Özlem Öztürk, Tolga Çiloğlu

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 669-676

The naturalness of synthetic speech depends heavily on appropriate modelling of prosodic aspects. Mostly, three prosody components are modelled: segmental duration, pitch contour and intensity. In this study, we present our work on modelling segmental duration in Turkish using machine-learning algorithms, especially Classification and Regression Trees. The models predict phone durations based on attributes...

chapter

A Pattern-Based Methodology for Multimodal Interaction Design

Andreas Ratzka, Christian Wolff

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 677-686

This paper describes a design methodology for multimodal interactive systems. The method suggested is meant to serve as a foundation for the application of robust software engineering techniques in the field of multimodal systems. Starting from a short review of current design approaches we present a high level view of the design process for multimodal systems, highlighting design issues related to...

chapter

A Pattern Learning Approach to Question Answering Within the Ephyra Framework

Nico Schlaefer, Petra Gieselmann, Thomas Schaaf, Alex Waibel

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 687-694

This paper describes the Ephyra question answering engine, a modular and extensible framework that allows multiple approaches to question answering to be integrated in one system. Our framework can be adapted to languages other than English by replacing language-specific components. It supports the two major approaches to question answering, knowledge annotation and knowledge mining. Ephyra uses the web...

chapter

Explicative Document Reading Controlled by Non-speech Audio Gestures

Adam J. Sporka, Pavel Žikovský, Pavel Slavík

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 695-702

There are many situations in which listening to a text produced by a text-to-speech system is easier or safer than reading, for example when driving a car. Technical documents, such as conference articles and manuals, usually consist of relatively plain and unequivocal sentences. These documents usually contain words and terms unknown to the listener because they are full of domain specific...

chapter

Hybrid Neural Network Design and Implementation on FPGA for Infant Cry Recognition

Israel Suaste-Rivas, Alejandro Díaz-Méndez, Carlos A. Reyes-García, Orion F. Reyes-Galaviz

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 703-709

It has been found that an infant’s crying carries much information in its sound wave. For small infants, crying is a form of communication, a very limited one, but similar to the way adults communicate. In this work we present the design of a hybrid Automatic Infant Cry Recognizer system that classifies different kinds of cries, with the objective of identifying some pathologies in newborn babies...

chapter

Speech and Sound Use in a Remote Monitoring System for Health Care

Michel Vacher, Jean-François Serignat, Stéphane Chaillol, Dan Istrate, et al.

Lecture Notes in Computer Science > Text, Speech and Dialogue > Dialogue > 711-718

Ageing affects the economic and social foundations of societies worldwide. Health care has to respond to the challenge that population ageing presents. Medical remote monitoring requires that human operators be assisted by smart information systems. Physiological and position sensors provide large amounts of data, but speech analysis and sound classification can give interesting additional information...

Text, Speech and Dialogue
9th International Conference, TSD 2006, Brno, Czech Republic, September 11-15, 2006. Proceedings
