This paper describes a novel algorithm that improves the performance of sparsity-based single-channel speech separation (SCSS) using compressed sensing, an emerging technique for efficient data reconstruction. The conventional approach assumes that the mixing conditions and source signals are stationary. For practical applications of audio source separation, however, we face the challenges...
Noise and residual crosstalk are two important issues that have to be addressed in practical applications of underdetermined blind source separation (UBSS) for speech mixtures. This paper proposes a noise-robust UBSS algorithm, with a residual crosstalk suppression scheme, to deal with highly overlapped speech sources in the short-time Fourier transform (STFT) domain. The proposed algorithm is firstly...
Determining the number of sources is a practical issue that has to be addressed in applications of underdetermined blind source separation (UBSS). This paper proposes a noise-robust UBSS algorithm for highly overlapped speech sources in the short-time Fourier transform (STFT) domain. The proposed algorithm first estimates the unknown number of sources in time-frequency...
Discontinuous transmission (DTX), which is widely adopted in speech codecs, is an important function of speech communication systems that can reduce the transmission bandwidth by at least half. Within a DTX system, comfort noise generation (CNG) plays a key role in the overall quality. Critical performance parameters with respect to the CNG including the transition quality from active...
The recently standardized 3GPP codec for Enhanced Voice Services (EVS) offers new features and improvements for low-delay real-time communication systems. Based on a novel, switched low-delay speech/audio codec, the EVS codec contains various tools for better compression efficiency and higher quality for clean/noisy speech, mixed content and music, including support for wideband, super-wideband and...
In this paper, an automatic speech recognition (ASR) system for ubiquitous environments is proposed and successfully implemented in a personalized voice command system for vehicle and living-room environments. The proposed ASR system introduces a novel scheme for separating speech sources from multiple speakers, detecting speech presence/absence by tracking the higher portion of speech power spectrum...
In this paper, an adaptive voice activity detector (VAD) is proposed and successfully implemented in an MFCC-based speech recognition system. The proposed VAD introduces a novel scheme for detecting speech presence/absence by tracking the higher portion of the speech power spectrum and judging the discrimination information. The VAD adjusts its judgment threshold adaptively. An automatic speech recognition...
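To illustrate what an adaptively adjusted judgment threshold can look like, here is a minimal sketch of a generic frame-energy VAD. It is not the scheme from the abstract above, which tracks the higher portion of the power spectrum. The function names `frame_energies` and `adaptive_vad`, the parameters `margin` and `alpha`, and the assumption that the first frame contains only noise are all hypothetical choices made for this example.

```python
def frame_energies(samples, frame_len):
    """Mean power of each non-overlapping frame of `frame_len` samples."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def adaptive_vad(energies, margin=3.0, alpha=0.9):
    """Return one speech/non-speech decision per frame.

    A frame is flagged as speech when its energy exceeds `margin` times a
    running noise estimate. The estimate (and hence the threshold) is
    updated only on frames judged to be non-speech.
    """
    if not energies:
        return []
    noise = energies[0]  # assumption: the first frame is noise only
    decisions = []
    for energy in energies:
        is_speech = energy > margin * noise
        if not is_speech:
            # Exponential smoothing lets the threshold follow slow noise changes.
            noise = alpha * noise + (1 - alpha) * energy
        decisions.append(is_speech)
    return decisions


# Usage: quiet background, a loud burst, then quiet again (frames of 4 samples).
signal = [0.01, -0.01] * 8 + [1.0, -1.0] * 6 + [0.01, -0.01] * 6
print(adaptive_vad(frame_energies(signal, 4)))
```

Updating the noise estimate only on non-speech frames keeps loud speech from raising the threshold, at the cost of slower recovery if the first-frame assumption is wrong.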