Most recognition methods, which have been shown to be highly efficient under noise-free conditions, fail dramatically at S/N ratios around or below 10 dB. One consequence of these high noise levels is that most Begin-End Point Detectors fail to properly separate the speech segments from the noise segments. As a result, the speech recognition mechanisms have no clear boundary at which to start the processing...
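The endpoint-detection failure described above is easiest to see with a classic short-time-energy detector. The sketch below is a minimal illustration of that family of detectors, not the paper's method; the frame length, threshold, and toy signal are all assumptions chosen for clarity. In clean audio it recovers the speech boundaries exactly, whereas strong noise lifts silence frames above the threshold and the boundaries are lost.

```python
import numpy as np

def endpoint_detect(signal, frame_len=160, threshold_db=-30.0):
    """Energy-based begin/end point detection: mark frames whose
    short-time energy exceeds a threshold relative to the loudest frame."""
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.sum(frames ** 2, axis=1)
    # dB relative to the loudest frame; the floor avoids log(0) on silence
    energy_db = 10.0 * np.log10(np.maximum(energy, 1e-12) / energy.max())
    active = np.flatnonzero(energy_db > threshold_db)
    if active.size == 0:
        return None
    # begin/end as sample indices of the first and last active frame
    return active[0] * frame_len, (active[-1] + 1) * frame_len

# Toy example: 0.5 s silence, 1 s tone, 0.5 s silence at a 16 kHz rate
fs = 16000
t = np.arange(fs)
speech = 0.5 * np.sin(2 * np.pi * 440 * t / fs)
signal = np.concatenate([np.zeros(fs // 2), speech, np.zeros(fs // 2)])
begin, end = endpoint_detect(signal)
```

Adding broadband noise to `signal` at around 10 dB S/N pushes the silence frames toward the threshold, which is exactly the failure mode the abstract describes.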
We propose a spatial diffuseness feature for deep neural network (DNN)-based automatic speech recognition to improve recognition accuracy in reverberant and noisy environments. The feature is computed in real-time from multiple microphone signals without requiring knowledge or estimation of the direction of arrival, and represents the relative amount of diffuse noise in each time and frequency bin...
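The intuition behind a diffuseness feature can be illustrated with the inter-channel coherence of two microphone signals: a single directional source is highly coherent across microphones, while a diffuse field is not. The sketch below is a simplified long-term proxy, not the paper's feature (which is computed per time-frequency bin in real time); the FFT size, hop, and test signals are assumptions.

```python
import numpy as np

def diffuseness(x1, x2, n_fft=256, hop=128):
    """Coherence-based diffuseness proxy per frequency bin:
    1 - magnitude-squared coherence, estimated over all frames.
    A directional source gives values near 0; uncorrelated
    (diffuse-like) noise gives values near 1."""
    win = np.hanning(n_fft)
    starts = range(0, len(x1) - n_fft + 1, hop)
    X1 = np.array([np.fft.rfft(win * x1[i:i + n_fft]) for i in starts])
    X2 = np.array([np.fft.rfft(win * x2[i:i + n_fft]) for i in starts])
    s12 = np.mean(X1 * np.conj(X2), axis=0)   # cross-spectrum
    s11 = np.mean(np.abs(X1) ** 2, axis=0)    # auto-spectra
    s22 = np.mean(np.abs(X2) ** 2, axis=0)
    msc = np.abs(s12) ** 2 / (s11 * s22 + 1e-12)
    return 1.0 - msc

# Toy check: the same signal at both mics is fully coherent (directional),
# while two independent noise signals mimic a diffuse field.
rng = np.random.default_rng(0)
src = rng.standard_normal(16384)
coherent = diffuseness(src, src)
diffuse = diffuseness(rng.standard_normal(16384), rng.standard_normal(16384))
```

Note that this estimate needs no direction-of-arrival information, which matches the property the abstract emphasizes.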
Recent smartphones often have more than one microphone in order to perform noise reduction. Although research on speech enhancement is already exploiting this feature, robust speech recognition is still not benefiting from it. In this paper we propose two feature enhancement methods developed especially for the case of a smartphone with a dual microphone operating in an adverse acoustic environment...
In this paper, we introduce a newly-created corpus of whispered speech simultaneously recorded via a close-talking microphone and a non-audible murmur (NAM) microphone in both clean and noisy conditions. To benchmark the corpus, which has been freely released recently, experiments on automatic recognition of continuous whispered speech were conducted. When training and test conditions are matched,...
Noise generated by the motion of a robot degrades the quality of the desired sounds recorded by robot-embedded microphones. On top of that, a moving robot is also vulnerable to its loud fan noise, whose direction changes relative to the moving limbs on which the microphones are mounted. To tackle the non-stationary ego-motion noise and the direction changes of the fan noise, we propose an...
We describe a speech system for commanding robots in human-occupied outdoor military supply depots. To operate in such environments, the robots must be as easy to interact with as are humans, i.e. they must reliably understand ordinary spoken instructions, such as orders to move supplies, as well as commands and warnings, spoken or shouted from distances of tens of meters. These design goals preclude...
This paper describes a DSP integration of sound source localization (SSL) and a multi-channel Wiener filter (MWF). To develop a robot audition system, we integrated the SSL and MWF modules into a DSP system. SSL is a module that perceives the direction of a human user's call: it measures the time delay of arrival among microphones and estimates the direction of the sound source. It also post-processes the resulting...
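The time-delay-of-arrival step mentioned in the abstract can be sketched for the simplest case of two microphones: the delay is taken from the peak of the cross-correlation and converted to a far-field direction via sin(θ) = cτ/d. This is a generic illustration, not the paper's DSP implementation; the sampling rate, microphone spacing, and test signal are hypothetical.

```python
import numpy as np

def tdoa_samples(x1, x2):
    """Estimate the time delay of arrival (in samples) of x2 relative
    to x1 from the peak of the full cross-correlation."""
    corr = np.correlate(x2, x1, mode="full")
    return int(np.argmax(corr)) - (len(x1) - 1)

def doa_degrees(delay_samples, fs, mic_distance, c=343.0):
    """Far-field direction from a two-microphone delay:
    sin(theta) = c * tau / d, clipped to the valid range."""
    tau = delay_samples / fs
    s = np.clip(c * tau / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))

# Toy example: white noise arriving 5 samples later at mic 2
rng = np.random.default_rng(0)
fs, d = 16000, 0.2          # 16 kHz, 20 cm spacing (assumed values)
src = rng.standard_normal(4096)
x1 = src
x2 = np.concatenate([np.zeros(5), src[:-5]])
delay = tdoa_samples(x1, x2)
angle = doa_degrees(delay, fs, d)
```

Real systems typically use the generalized cross-correlation with phase transform (GCC-PHAT) instead of the raw correlation, which is more robust in reverberation.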
The pioneering work on the `separation of speech from a mixture of acoustic sources' dates back to as early as the 1970s. Since then, two main approaches, the traditional approach using signal-processing techniques and the computational auditory scene analysis (CASA) approach using auditory-modeling methods, have been pursued concurrently by researchers seeking a solution to the problem of what is known as...
We present an overview of the data collection and transcription efforts for the COnversational Speech In Noisy Environments (COSINE) corpus. The corpus is a set of multi-party conversations recorded in real world environments with background noise that can be used to train noise-robust speech recognition systems. We explain the motivation for creating such a corpus and describe the resulting audio...
In this paper, we propose an acoustic-based head orientation estimation method using a microphone array mounted on a wheelchair, and apply it to a novel interface for controlling a powered wheelchair. The proposed interface does not require disabled people to wear any microphones or utter recognizable voice commands. By mounting the microphone array system on the wheelchair, our system can easily...
This paper explores the problems of speech recognition in a (sometimes) noisy environment. An adaptive acoustic beamformer based on the Griffiths-Jim method is proposed, together with a "hot-spot" within which speech is accepted and outside of which it is rejected; this geometrically defined boundary will be shown to give a certain amount of noise immunity and to improve the signal-to-noise ratio for the second stage,...
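A minimal two-channel version of the Griffiths-Jim structure (the generalized sidelobe canceller) can illustrate the adaptive beamformer: a fixed beamformer (channel average) passes broadside speech, a blocking matrix (channel difference) yields a noise-only reference, and an LMS filter subtracts the noise correlated with that reference. All signals and parameter values below are toy assumptions, not the paper's configuration.

```python
import numpy as np

def griffiths_jim_gsc(x1, x2, n_taps=16, mu=0.01):
    """Two-channel Griffiths-Jim generalized sidelobe canceller.
    The fixed beamformer (average) keeps broadside speech; the
    blocking branch (difference) cancels it, leaving a noise
    reference that an LMS filter subtracts from the main path."""
    d = 0.5 * (x1 + x2)            # fixed beamformer output
    b = x1 - x2                    # blocking-matrix (noise) output
    w = np.zeros(n_taps)
    buf = np.zeros(n_taps)
    out = np.zeros_like(d)
    for n in range(len(d)):
        buf = np.roll(buf, 1)
        buf[0] = b[n]
        e = d[n] - w @ buf         # enhanced sample = main - noise estimate
        w += mu * e * buf          # LMS weight update
        out[n] = e
    return out

# Toy example: broadside speech (identical at both mics) plus a noise
# component picked up by mic 1 only, so the difference channel is noise.
rng = np.random.default_rng(1)
N = 8000
speech = np.sin(2 * np.pi * 200 * np.arange(N) / 8000)
noise = rng.standard_normal(N)
x1 = speech + noise
x2 = speech.copy()
out = griffiths_jim_gsc(x1, x2)
```

Because broadside speech is identical on both channels, the blocking branch contains no speech by construction; this is also the structure's known weakness, since steering errors leak speech into the noise reference.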
In normal human communication, people face the speaker when listening and usually pay attention to the speaker's face. Therefore, in robot audition, the recognition of the front talker is critical for smooth interactions. This paper presents an enhanced speech detection method for a humanoid robot that can separate and recognize speech signals originating from the front even in noisy home environments...