Previous studies on the performance evaluation of single-channel speech separation (SCSS) algorithms mostly used automatic speech recognition (ASR) accuracy as their performance measure. Assessing the separated signals with metrics other than ASR accuracy has the benefit that the results are expected to carry over to applications beyond ASR. In this paper, in addition to conventional speech...
We present new results on single-channel speech separation and suggest a new separation approach to improve the speech quality of separated signals from an observed mixture. The key idea is to derive a mixture estimator based on sinusoidal parameters. The proposed estimator is aimed at finding sinusoidal parameters in the form of codevectors from vector quantization (VQ) codebooks pre-trained for...
The problem of detecting the number of speakers for a particular segment occurs in many different speech applications. In single channel speech separation, for example, this information is often used to simplify the separation process, as the signal has to be treated differently depending on the number of speakers. Inspired by the asymptotic maximum a posteriori rule proposed for model selection,...
In this paper, we propose a closed-loop system to improve the performance of single-channel speech separation in a speaker-independent scenario. The system is composed of two interconnected blocks: a separation block and a speaker identification block. The improvement is accomplished by incorporating the speaker identities found by the speaker identification block as additional information for the...
We present a novel single-channel separation approach to improve the separation performance while recovering the signals from a mixture. The key idea in this research is to employ a mixture estimator based on unconstrained modified sinusoidal parameters. Compared to the mixmax (binary mask) and Wiener filter (softmask) approaches, the proposed approach works independently of pitch estimates. Furthermore,...
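The two baseline masks named in the abstract above can be illustrated with a minimal sketch. It assumes the per-bin magnitude spectra of both sources are already available (in practice they would be estimated from trained models), and the function names are hypothetical. It does not reproduce the proposed sinusoidal estimator.

```python
def binary_mask(mag_a, mag_b):
    """Binary mask for source A: 1 where A dominates the bin, else 0."""
    return [1.0 if a >= b else 0.0 for a, b in zip(mag_a, mag_b)]


def wiener_mask(mag_a, mag_b, eps=1e-12):
    """Soft (Wiener-style) mask for source A: power ratio in [0, 1]."""
    return [a * a / (a * a + b * b + eps) for a, b in zip(mag_a, mag_b)]


def apply_mask(mask, mixture_mag):
    """Scale each mixture magnitude bin by the mask value."""
    return [m * y for m, y in zip(mask, mixture_mag)]


# Illustrative (made-up) magnitudes for three frequency bins.
mag_a = [1.0, 0.2, 0.5]
mag_b = [0.1, 0.8, 0.5]
hard = binary_mask(mag_a, mag_b)
soft = wiener_mask(mag_a, mag_b)
```

The binary mask assigns each bin entirely to one source, while the soft mask shares the bin according to the estimated power ratio.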
In many signal processing applications, including audio and speech processing as well as fault diagnosis in rotating machinery, finding a compact representation of the observed signals is often highly desirable. In this respect, chirplets have recently been introduced as an efficient tool for signal representation. However, determining the required number of chirp atoms...
It has already been demonstrated that, in speech applications, the choice of features has a much larger effect on overall accuracy than the choice of generative model. In this paper, we propose a subband perceptually weighted transformation (SPWT) applied to the magnitude spectrum to improve the performance of single-channel speech separation (SCSS). In particular, we compare three feature types...
In this paper, we address the problem of monaural music and speech separation based on soft mask filtering. As in other well-known techniques, estimates of the statistical models of the sources are needed. Hence, we employ vector quantization (VQ) for the synthesis stage, which results in more accurate codebook entries for each source in contrast to the commonly used Gaussian mixture model (GMM) approach...
In this paper, we present proofs for an optimum mixture estimator for the single-channel speech separation (SCSS) problem. We demonstrate that by replacing the mixture-maximization (Mixmax) or quadratic estimators with the proposed optimum estimator, it is possible to reach a lower estimation error while separating a mixture of speech signals. In addition, the proposed estimator results...
One of the most important objectives in mobile communication systems is secure data communication (including text, picture, video and voice), especially at high bit rates. For this reason, this paper proposes a new procedure in which the intended data or voice is modulated onto speech-like waveforms. The modulated waveforms are then transmitted over the global system for mobile communications...
One of the most important objectives in mobile communication systems is secure voice and data communication (including text, picture, video and voice), especially at high bit rates. In this paper, a new procedure is proposed in which the intended data or voice is encrypted and modulated onto speech-like waveforms. The modulated waveforms are transmitted over the global system for mobile communications (GSM)...
In many speech separation and enhancement techniques, establishing a statistical model such as a vector quantizer (VQ) is essential for so-called model-based approaches. It is also desirable to establish a trade-off between sparsity and accuracy in the quantizer. To do so, in this paper we present split-VQ for sinusoidal parameters. We observed that sinusoidal parameters including amplitudes...
In this paper, two new noise reduction algorithms are presented that are robust to the mismatch that commonly exists between the ideal array look direction and the actual speaker's direction of arrival (DOA). Deriving a set of leakage constraints and incorporating them into the state-of-the-art Generalized Sidelobe Canceller (GSC) algorithm leads to two new noise reduction algorithms, namely Constrained...
In this paper, a new joint structure for noise reduction in a reverberant environment is proposed. The proposed structure consists of an acoustic echo canceller (AEC) followed by a noise reduction stage such as a generalized sidelobe canceller (GSC). This configuration is called AEC-GSC. It improves noise cancellation of the GSC beamformer in the presence of acoustic echoes in highly reverberant...
In this paper, the segment proportionate variable-step-size normalized least mean square (SPVS-NLMS) algorithm is proposed. Using computer simulations, we show that the proposed SPVS-NLMS algorithm converges faster than the segment proportionate normalized least mean square algorithm by Hongyang and Doroslovacki (2005) and has a higher tracking ability for speech signals.
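For orientation, the plain NLMS update that the algorithms above build on can be sketched as follows. This is a minimal sketch of standard NLMS only, not the segment proportionate or variable-step-size variants. The function names, the step size `mu`, and the three-tap test system are illustrative assumptions.

```python
import random


def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Plain NLMS adaptive filter; returns (weights, a-priori errors)."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # input history, most recent sample first
    errors = []
    for x_n, d_n in zip(x, d):
        buf = [x_n] + buf[:-1]
        y_n = sum(wi * bi for wi, bi in zip(w, buf))  # filter output
        e_n = d_n - y_n                               # a-priori error
        norm = eps + sum(b * b for b in buf)          # input energy
        step = mu * e_n / norm                        # normalized step
        w = [wi + step * bi for wi, bi in zip(w, buf)]
        errors.append(e_n)
    return w, errors


def demo():
    """Identify a hypothetical 3-tap system from noise-free data."""
    rng = random.Random(0)
    h = [0.6, -0.3, 0.1]  # assumed unknown system (illustrative)
    x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
         for n in range(len(x))]
    w, errors = nlms_identify(x, d, num_taps=len(h))
    return h, w, errors
```

Calling `demo()` returns the true taps, the adapted weights, and the error sequence. Proportionate variants replace the single step size with per-coefficient gains.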