In the era of deep learning, although beam-forming multi-channel signal processing is still very helpful, it was reported that single-channel robust front-ends usually cannot benefit deep learning models because the layer-by-layer structure of deep learning models provides a feature extraction strategy that automatically derives powerful noise-resistant features from primitive raw data for senone...
Acoustic beamforming has played a key role in robust automatic speech recognition (ASR) applications. Accurate estimates of the speech and noise spatial covariance matrices (SCMs) are crucial for successfully applying minimum variance distortionless response (MVDR) beamforming. Reliable estimation of time-frequency (TF) masks can improve the estimation of the SCMs and significantly improve...
In this paper we deal with the estimation of the room impulse response (RIR) between each loudspeaker and each microphone of a wireless acoustic network of two nodes when used to implement a crosstalk canceller. The nodes of the network are commercial devices connected via standard wireless links, presenting low computational requirements and non-ideal synchronization between them. Moreover, the nodes...
The impulse response (IR) of an acoustic environment or audio device can be measured by recording its response to a known test signal. Ideally, the same digital clock should be used for playback and recording to ensure synchronous digital-to-analog and analog-to-digital conversion. When measuring the acoustic performance of a hardware device, be it for audio input to a device microphone or audio output...
This paper proposes a novel method for direction-of-arrival (DOA) estimation in azimuth based on spatial sparsity for a wideband acoustic signal using a uniform linear array. The performance of this method is compared with classic subspace-based methods such as Root-MUSIC and ESPRIT. In the presence of a multipath reflection, the proposed spatial-sparsity-based method performs significantly better...
Robot audition for humanoid robots interacting naturally with humans in an unconstrained real-world environment is a hitherto unsolved challenge. The recorded microphone signals are usually distorted by background and interfering noise sources (speakers) as well as room reverberation. In addition, the movements of a robot and its actuators cause ego-noise which degrades the recorded signals significantly...
This paper introduces a diffusion strategy for multichannel active noise control. The diffusion strategy is designed to reduce the computational complexity by distributing computations across all nodes of the multichannel active noise control system. Thus, the multichannel filtered-x normalized least mean square algorithm, which is the simplest way for real active noise control environments, is used as...
In the field of phonetics, voice onset time (VOT) is a major parameter of human speech defining linguistic contrasts in voicing. In this article, a landmark-based method of automatic VOT estimation in acoustic signals is presented. The proposed technique is based on a combination of two landmark detection procedures for release burst onset and glottal activity detection. Robust release burst detection...
The term “World Englishes” describes the current state of English, and one of its main characteristics is a large diversity of pronunciations, called accents. In our previous studies, we developed several techniques to realize effective clustering and visualization of this diversity. For this aim, the accent gap between two speakers has to be quantified independently of extra-linguistic factors...
The freshness of vegetables attracts significant interest because consumers decide how to cook a vegetable based on its maturity, or select better vegetables in supermarkets based on freshness information. This paper focuses on tomatoes, and reports our preliminary studies on acoustic probing techniques to estimate their storage term. We hit an acoustic probe that sweeps audible...
This paper proposes a system to convert neutral speech to emotional speech with controlled intensity of emotions. Most previous research on the synthesis of emotional voices used statistical or concatenative methods that can synthesize emotions in categorical emotional states such as joy, anger, sadness, etc. While humans sometimes enhance or relieve emotional states and intensity during daily life,...
To ensure a satisfactory QoE (Quality of Experience), it is essential to establish a method that can be used to efficiently investigate recognition performance for spontaneous speech. Such a method makes it possible to monitor recognition performance when providing speech recognition services. It can also be used as a reliability measure in speech dialogue systems. Previously, methods for estimating...
In an acoustic partial discharge (PD) detection system, estimation of time difference of arrival (TDOA) between acoustic signals arriving at a sensor array is an important criterion for accurate localization of PD sources inside a transformer. The localization accuracy can be improved by improving the accuracy of estimation of TDOA between sensors. The estimation of TDOA is a challenging task because...
A time-frequency pooled angular spectrum capable of suppressing spatial aliasing effectively is studied for a widely spaced microphone array to estimate multi-source directions of arrival (DOA) in a reverberant environment based on a diffuse noise model and the time-frequency sparsity of acoustic signals. By using a constant false-alarm rate (CFAR) detection technique, only the high-valued elements very likely...
The improved multiband-structured subband adaptive filter (IMSAF) algorithm could enhance the convergence performance of multiband-structured subband adaptive filter algorithms and affine projection. However, the original IMSAF algorithm with a fixed step-size factor has to compromise between convergence rate and steady-state misalignment. A new IMSAF algorithm with variable step-size matrix (VSM)...
This paper describes a non-field-of-view (NFOV) localization approach for a mobile robot in an unknown environment based on an acoustic signal combined with the geometrical information from an optical sensor. The approach estimates the location of a target through the mobile robot's sensor observation frame, which consists of a combination of diffraction and reflection acoustic signals and a 3-D environment...
Multirotor helicopters are expected to be used for various tasks, including rescue missions and surveillance. For those missions, helicopters are equipped with sensors in order to recognize the environment, and auditory information is one such kind of information that can be used to find the target sound source even if it is occluded by objects. One of the difficulties comes from the fact that the noise...
This paper presents a constrained navigation on a Metric Description Graph (MDG) based on the use of an H-Infinity Filter (H-∞) including the restriction on the graph as a fictitious observation. The main goal is to reduce the number of ultrasonic beacons required to cover an extensive indoor area. This reduction of the localization infrastructure involves an increase in the error in the estimation...
Many multiple narrow-band source detection and DOA methods have been proposed. In the past, we have used an Approximate Maximum-Likelihood (AML) method needing considerable computational complexity for the detection and 3D DOA estimation of multiple broad-band sources. Now, we propose a novel eigen system-based array detection and DOA estimation of multiple broad-band sources with significantly reduced...
A single-station-based three-dimensional (3D) acoustic lightning mapping system comprising a microphone array has been developed and used for lightning observations, in which a new broadband direction-of-arrival (DOA) estimation technique, namely the incoherent signal-subspace method, is proposed for thunder signals in the far-field. Two cloud-to-ground (CG) flashes with highly branched channels recorded...