Speech-based human-robot interaction is often plagued by issues such as reverberation and changes in speaker position that degrade overall performance. In this paper, we present a method for compensating for the joint effects of reverberation and changes in speaker position. The acoustic perturbation caused by these two factors takes its toll on Automatic Speech Recognition (ASR) and, in turn, the Spoken Language...
We focus on the problem of speech recognition in the presence of nonstationary sudden noise, which is very likely to happen in home environments. To handle this problem, a model compensation method based on a factorial hidden Markov model (FHMM) has been recently introduced. In this architecture, speech and noise processes are modeled in parallel by a phoneme FHMM that is built by combining a clean-speech...
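The FHMM-based model compensation described above pairs every clean-speech state with every noise state. A minimal sketch of that state-combination step, using the common max-approximation in the log-spectral domain (the function name, array shapes, and the use of the max-approximation are illustrative assumptions, not details taken from the paper):

```python
import numpy as np

def combine_states(speech_means, noise_means):
    """Build the Cartesian-product state space of a factorial HMM.

    Each combined state pairs one speech state with one noise state;
    under the max-approximation, the noisy log-spectrum of a combined
    state is modeled as the elementwise max of the two state means.
    """
    S, D = speech_means.shape   # S speech states, D log-spectral bins
    N, _ = noise_means.shape    # N noise states
    combined = np.empty((S * N, D))
    for i in range(S):
        for j in range(N):
            combined[i * N + j] = np.maximum(speech_means[i],
                                             noise_means[j])
    return combined
```

With S speech states and N noise states this yields S*N combined states, which is why such compensation schemes typically keep the noise model small.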
We propose a novel sparse representation for heavily underdetermined multichannel sound mixtures, i.e., with many more sources than microphones. The proposed approach operates in the complex Fourier domain, thus preserving the spatial characteristics carried by phase differences. We derive a generalization of K-SVD which jointly estimates a dictionary capturing both spectral and spatial features, a sparse...
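For context on the K-SVD generalization mentioned above: in classical (real-valued) K-SVD, each dictionary atom is updated via a rank-1 SVD of the residual restricted to the signals that use that atom. A minimal sketch of that single-atom update (the paper's complex-domain, spatially-aware generalization differs; function and variable names here are illustrative):

```python
import numpy as np

def ksvd_atom_update(Y, D, X, k):
    """Classical K-SVD update of dictionary atom k.

    Y: data matrix (dim x n_signals), D: dictionary (dim x n_atoms),
    X: sparse codes (n_atoms x n_signals). Updates D[:, k] and the
    corresponding row of X in place and returns them.
    """
    omega = np.nonzero(X[k])[0]          # signals that use atom k
    if omega.size == 0:
        return D, X                      # atom unused: nothing to do
    # Residual with atom k's contribution removed
    E = Y - D @ X + np.outer(D[:, k], X[k])
    E_r = E[:, omega]                    # restrict to relevant signals
    U, s, Vt = np.linalg.svd(E_r, full_matrices=False)
    D[:, k] = U[:, 0]                    # best rank-1 atom (unit norm)
    X[k, omega] = s[0] * Vt[0]           # matching coefficients
    return D, X
```

Because the update is the best rank-1 approximation of the restricted residual, the overall reconstruction error is non-increasing at each atom update.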
The ability of robots to listen to several things at once with their own “ears”, that is, robot audition, is an important factor in improving interaction and symbiosis between humans and robots. The critical issue in robot audition is real-time processing and robustness against noisy environments with high flexibility to support various kinds of robots and hardware configurations. This paper first...
Speech and audio signal processing research is a tale of data collection efforts and evaluation campaigns. Large benchmark datasets for automatic speech recognition (ASR) have been instrumental in the advancement of speech recognition technologies. However, when it comes to robust ASR, source separation, and localization, especially using microphone arrays, the perfect dataset is out of reach, and...
The structure of a novel soft robot which can mimic several movements of the human tongue was designed with a series of embedded chambers using a pneumatic actuation pattern. Two silicone materials (Ecoflex 0030 and PDMS) were chosen to fabricate the body of the robot. FEM simulations were carried out using the Abaqus software. Four types of deformation were achieved in simulation, including roll, groove,...
The work in this paper concerns a small-footprint Acoustic Model (AM) and its use in the implementation of a Large Vocabulary Isolated Speech Recognition (LVISR) system for commanding a robot in the Korean language, which requires about 500 KB of memory. Tree-based state clustering was applied to reduce the total number of unique states while preserving the original performance. A decision tree induction...
Recent advances in technology have produced learning methods that are beginning to supersede traditional ones. Augmented Reality (AR) is one such technology that has seen many applications in education. This paper describes how an Immersive Augmented Reality (iAR) application, in conjunction with a book, can act as a new smart learning method by engaging as many of the user's senses and...
This article proposes an emotive lifelike robotic face, called ExpressionBot, that is designed to support verbal and non-verbal communication between the robot and humans, with the goal of closely modeling the dynamics of natural face-to-face communication. The proposed robotic head consists of two major components: 1) a hardware component that contains a small projector, a fish-eye lens, a custom-designed...
Recent developments in human-robot interaction show how the ability to communicate with people in a natural way is of great importance for artificial agents. The implementation of facial expressions has been found to significantly increase the interaction capabilities of humanoid robots. For speech, displaying a correct articulation with sound is mandatory to avoid audiovisual illusions like the McGurk...
This paper presents an interactive humanoid robot that can moderate a multi-player fastest-voice-first-type quiz game by leveraging state-of-the-art robot audition techniques such as sound source localization and separation and speech recognition. In this game, a player who says "Yes" first gets the right to answer a question, and players are allowed to barge in on a question utterance of...
The application of robotics to telepresence can enhance user interaction experience by providing embodiment, engaging behaviors, automatic control, and human perception. This paper presents a new telepresence robot with gesture-based attention direction to orient the robot towards attention targets according to human deictic gestures. Gesture-based attention direction is realized by combining Localist...
In this paper we address the problem of musical genre recognition for a dancing robot with embedded microphones capable of distinguishing the genre of a musical piece while moving in a real-world scenario. For this purpose, we assess and compare two state-of-the-art musical genre recognition systems, based on Support Vector Machines and Markov Models, in the context of different real-world acoustic...
This paper presents a modification of a speech emotion recognition system for a social robot. We propose using speaker-dependent classifiers with a prior speaker-identification step. Emotion recognition is done using global acoustic features of the speech. Six speech-signal parameters are computed with specialised software. The feature extraction is based on calculating global statistics of those...
In this paper, an unsupervised adaptation algorithm for the microphone array topology of a humanoid robot is proposed, so that the spatial filtering performance is improved. In the given exemplary case, the target suppression (‘blocking’) performance of a geometrically-constrained BSS (GC-BSS) algorithm is shown to improve by the adaptation of the array topology. As a decisive feature, an online performance...
Blind or visually impaired people want to know more about things they hear in the world. They want to know what other people can “see”. With its cameras, a robot can fill that role. But how can an individual make requests about arbitrary objects they can only hear? How can people make requests about objects they do not know either the exact location of, or any uniquely identifiable traits? This work...
In a previous study, we developed an embodied virtual communication system for human interaction analysis by synthesis in avatar-mediated communication and confirmed the close relationship between speech overlap and the period for activating embodied interaction and communication through avatars. In this paper, we propose an interaction-activated communication model based on the heat conduction equation...
In this paper we present results from a user evaluation of a robot bartender system which handles state uncertainty derived from speech input by using belief tracking and generating appropriate clarification questions. We present a combination of state estimation and action selection components in which state uncertainty is tracked and exploited, and compare it to a baseline version that uses standard...
This research explored whether robots can use modern speech synthesizers to convey emotion with their speech. We investigated the use of MARY, an open source speech synthesizer, to convey a robot's emotional intent to novice robot users. The first experiment indicated that participants were able to distinguish the intended emotions of anger, calm, fear, and sadness with success rates of 65.9%, 68...
Language makes it possible to transfer information between a speaker and a listener, both of whom possess the ability to use it. Using a "speaker-listener" situation, we have compared the verbal and emotional expressions of neurotypical and autistic children aged 6 to 7 years. The speaker was always a child (neurotypical or autistic); the listener was a human InterActor or an InterActor robot, i.e...