Musicians often have the following problem: they have a music score that requires 2 or more players, but they have no one with whom to practice. So far, score-playing music robots exist, but they lack adaptive abilities to synchronize with fellow players' tempo variations. In other words, if the human speeds up their play, the robot should also increase its speed. However, computer accompaniment systems...
Predictability is an important factor for determining robot motions. This paper presents a model to generate robot motions based on reliable predictability evaluated through a dynamics learning model which self-organizes object features. The model is composed of a dynamics learning module, namely Recurrent Neural Network with Parametric Bias (RNNPB), and a hierarchical neural network as a feature...
This paper describes a speedup and performance improvement of multi-channel semi-blind ICA (MCSB-ICA) with parallel and resampling-based block-wise processing. MCSB-ICA is an integrated method of sound source separation that accomplishes blind source separation, blind dereverberation, and echo cancellation. This method enables robots to separate a user's speech signals from observed signals including...
This paper presents a novel synchronizing method for a human-robot ensemble using coupled oscillators. We define an ensemble as a synchronized performance produced through interactions between independent players. To attain better synchronized performance, the robot should predict the human's behavior to reduce the difference between the human's and robot's onset timings. Existing studies in such...
In real-world situations, a robot may often encounter an “under-determined” situation, where there are more sound sources than microphones. This paper presents a speech separation method using a new constraint on the harmonic structure for a simultaneous speech-recognition system in under-determined conditions. The requirements for a speech separation method in a simultaneous speech-recognition system...
We describe the integration of preprocessing and automatic speech recognition based on Missing-Feature Theory (MFT) to recognize a speech signal subject to heavy interference, such as a signal from a desired speaker who is separated from an interfering speaker by only a narrow angle. As a speech signal separated from a mixture of speech signals includes leakage from the other speech signals, recognition performance of the separated speech degrades...
This paper presents voice-quality control of humanoid robots based on a new model of spectral envelope modification corresponding to the vertical head motions, and left-right sound-pressure modulation corresponding to the horizontal head motions. We assume that a pitch-axis rotation, or a vertical head motion, and a yaw-axis rotation, or a horizontal head motion, affect the voice quality independently...
This paper presents an ICA-based robot audition system which estimates the reverberation time of the environment automatically by using the robot's own speech. The system is based on multi-channel semi-blind independent component analysis (MCSB-ICA), a source separation method using a microphone array that can separate user and robot speech under reverberant environments. Perception of the reverberation...
A phoneme-acquisition system was developed using a computational model that explains the developmental process of human infants in the early period of acquiring language. There are two important findings in constructing an infant's acquisition of phonemes: (1) an infant's vowel-like cooing tends to invoke utterances that are imitated by its caregiver, and (2) maternal imitation effectively reinforces...
We aim at developing a singer robot capable of listening to music with its own “ears” and interacting with a human's musical performance. Such a singer robot requires at least three functions: listening to the music, understanding what position in the music is being performed, and generating a singing voice. In this paper, we focus on the second function, that is, the capability to align an audio...
This paper describes a step-size parameter adaptation technique of multi-channel semi-blind independent component analysis (MCSB-ICA) for a “barge-in-able” robot audition system. By “barge-in”, we mean that the user can speak simultaneously when the robot is speaking. We focused on MCSB-ICA to achieve such an audition system because it can separate a user's and a robot's speech under reverberant...
A humanoid robot must recognize a target speech signal while people around the robot are chatting in the real world. To recognize the target speech signal, the robot has to separate it from the other speech signals and recognize the separated speech signal. As the separated signal includes distortion, automatic speech recognition (ASR) performance degrades. To avoid the degradation, we trained...
This paper presents novel methods that support browsing a long meeting record, with the aim of supporting public involvement. Facilitating public involvement in the consensus-building process for community development requires considerable effort and time for sharing context and concerns among citizens and stakeholders. A record of a public meeting often becomes too long to overview and to understand for people...
Reliable predictability is one of the main factors that determine human behaviors. The authors developed a model that searches and generates robot motions based on reliable predictability. Training of the model consists of three phases. In the first phase, the model trains a sequential learner, namely recurrent neural network with parametric bias, to self-organize robot and object dynamics. In the...
A continuous vocal imitation system was developed using a computational model that explains the process of phoneme acquisition by infants. Human infants perceive speech sounds not as discrete phoneme sequences but as continuous acoustic signals. One of the critical problems in phoneme acquisition is the design for segmenting these continuous speech sounds. The key idea to solve this problem is that articulatory...
This paper proposes a model that enables a robot to predict and imitate the motions of another by reusing its body forward-inverse model. Our model includes three approaches: (i) projection of a self-forward model for predicting phenomena in the external environment (other individuals), (ii) “triadic relation” that is mediation by a physical object between self and others, (iii) introduction...
This paper describes a new method that allows “Barge-In” in various environments for robot audition. “Barge-in” means that a user begins to speak simultaneously while a robot is speaking. To achieve the function, we must deal with problems on blind dereverberation and echo cancellation at the same time. We adopt Independent Component Analysis (ICA) because it essentially provides a...
This paper presents the design and implementation of 3D Auditory Scene Visualizer based on the visual information seeking mantra, “overview first, zoom and filter, then details on demand”. The machine audition system called HARK captures 3D sounds with a microphone array. The natural language processing called SalienceGraph visualizes topic transition by using discourse salience. The 3D visualizer...
If machine audition can recognize an auditory scene containing simultaneous and moving talkers, what kinds of awareness will people gain from an auditory scene visualizer? This paper presents the design and implementation of 3D Auditory Scene Visualizer based on the visual information seeking mantra, i.e., “overview first, zoom and filter, then details on demand”. The machine audition system...
In normal human communication, people face the speaker when listening and usually pay attention to the speaker's face. Therefore, in robot audition, the recognition of the front talker is critical for smooth interactions. This paper presents an enhanced speech detection method for a humanoid robot that can separate and recognize speech signals originating from the front even in noisy home environments...