The use of modern technological advances in real-time biomedical analysis is crucial. The current work focuses on glottal pathology discrimination based on non-invasive speech analysis techniques. The primary setback in developing such a method is the irregular performance degradation of several state-of-the-art acoustic features. To address such problems, we have used the glottal-to-noise excitation ratio, which...
This paper presents a signal processing technique for segmenting short speech utterances into unvoiced and voiced sections and identifying points where the spectrum becomes steady. The segmentation process is part of a system for deriving musculoskeletal articulation data from disordered utterances, in order to provide training feedback for people with speech articulation problems. The approach implement...
This paper describes speaker localization and speech detection techniques for domestic environments. In real environments, it is hard to localize speakers because reverberation causes discrepancy from the simple spherical wave assumption. We propose a template-based method that calibrates the localization errors included in conventional methods. In addition, we use statistical speech detection methods...
Audio plays an important role among the information sources in our lives. Current technology makes it possible to record a person's extensive daily activities on a portable device as life-logs, from a long-term and multi-dimensional point of view. To make such recordings effective, this paper focuses on classifying speech, music, and other kinds of ambient sound, which...
This paper presents the parameterization of speech based on the amplitude and frequency modulation (AM-FM) model and its application to speaker identification. Speech parameterization is based on three different bandwidths. Speaker identification is done using an auto-associative neural network (AANN). The AANN is trained with SOLO speaking style speech signals, and a network is created for each speaker. The...
Endpoint detection is a preliminary step in speech signal processing and is vital to speech recognition. Most recent endpoint detection algorithms give satisfactory results at high signal-to-noise ratios (SNRs), but they may fail when the noise level is too high. In this paper, a novel endpoint detection algorithm based on 12-order MFCC and spectral entropy in the framework...
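As a rough illustration of the spectral entropy feature named in the abstract above, the following is a minimal sketch, not the paper's algorithm. It assumes the common definition (Shannon entropy of the normalized power spectrum of a frame); the function names, the naive DFT, and the 64-sample test frames are hypothetical choices made for the example. A tonal frame concentrates its power in one bin and yields low entropy, while a frame with a flat spectrum yields high entropy.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum of a real frame (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame, eps=1e-12):
    """Shannon entropy (in bits) of the normalized power spectrum."""
    power = power_spectrum(frame)
    total = sum(power) + eps  # eps guards against an all-zero frame
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


# Hypothetical test frames: a pure tone and a unit impulse (flat spectrum).
tone = [math.sin(2.0 * math.pi * 8 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63

print(spectral_entropy(tone))     # close to 0 bits
print(spectral_entropy(impulse))  # close to log2(33) bits
```

In an endpoint detector, a per-frame value like this would typically be compared against a threshold or combined with other features; how the paper combines it with MFCCs is not stated in the visible text.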
The present paper investigates the modelling of the McGurk effect, an audio-visual speech perceptual illusion, with a distributed model of memory. The network is trained with congruent auditory and visual patterns and tested with incongruent sets of patterns considered to produce the McGurk effect.
The speech signal is very unpredictable and inconsistent, and produces different curves each time it is plotted. This makes the speech signal a very challenging field in terms of its reception, modulation, demodulation, filtering and processing. It is therefore very difficult to formulate a generalized mathematical model for such a signal. However, it can be shown that flexible systems like Neural Networks can be implemented...
Voice activity detection (VAD) is a fundamental part of speech processing. Combining multiple acoustic features is an effective way to make VAD more robust against various noise conditions. Several feature combination methods have been proposed in which the weights for feature values are optimized based on minimum classification error (MCE) training. We improve these MCE-based methods...
We propose an efficient and effective nonlinear feature-domain noise suppression algorithm, motivated by the minimum mean square error (MMSE) optimization criterion. A multilayer perceptron (MLP) neural network in the log spectral domain minimizes the difference between noisy and clean speech. By using this method as a pre-processing stage of a speech recognition system, the recognition rate in noisy...
To identify the English pronunciation errors made by Chinese learners, this paper utilizes uni-directional microphones to construct a superdirective beamformer for capturing high quality input speech, and integrates the techniques of anti-model and confidence measure into the speech recognizer for accurate identification of the speaker's pronunciation errors. As to the beamformer, although designing...
This paper describes the principal aspects of developing a speaker verification system based on a Spanish corpus. The main goal is to obtain classification results and behavior using Support Vector Machines (SVM) as the classifier technique. The most relevant aspects involved in developing a Spanish corpus are given. For the front-end processing, a novel method to suppress...
After studying the robust optimization of speech recognition systems, we propose an improved wavelet threshold de-noising method and combine it with a temporal filter to pre-enhance noisy speech signals before recognition, which leads to good results. Then a hybrid model of a hidden Markov model (HMM) and a BP neural network is proposed, using BP to get the HMM observation probability,...
This paper proposes an eigenvalue decomposition algorithm for robust time-delay estimation (TDE) based on three microphones in situations where reverberation and spatial noise are present. This algorithm regards time-delay estimation with spatial noise and reverberation as a blind channel identification problem for a double-input triple-output system, and uses a lag correlation matrix to reduce the spatial...
As more and more miniature speech communication devices appear, two-element microphone arrays are drawing a lot of attention due to their simplicity and their ability to suppress directional noise. This paper develops a dual-channel speech enhancement method by combining the first-order differential microphone (FDM) array and single-channel spectral enhancement techniques. The method can obtain an...
The theoretical foundation of traditional microphone array post-filters is the assumption that the noise between sensors is uncorrelated. However, this assumption is inaccurate in real environments, where correlated noise exists. In this paper, a generalized microphone array post-filter is proposed to deal with both correlated and uncorrelated noise, and a novel perceptual filter...
This paper addresses the problem of robust speech endpoint detection in the aircraft cockpit voice background. The proposed method is based on a statistical model approach. Based on an analysis of the voice background characteristics, a complex Laplacian distribution model that directly targets noisy speech is established; then the likelihood ratio test (LRT) based on binary hypothesis...
Adaptive noise cancellers (ANCs) do not provide sufficient noise reduction in diffuse noise fields. In this paper, a new hybrid structure is proposed as a solution to this problem. The proposed system is a combination of two subsystems, an ANC and a new multistage post-filter. The post-filter is based on linear prediction (LP) and attempts to extract the speech component by using intermediate ANC...
In this work, we present a new mask estimation technique that uses a neural network classifier to determine the reliability of spectrographic elements. In addition, we compare several kinds of classification features that make no assumptions about the corrupting noise signal, but rather exploit spectrographic characteristics of the speech signal. The performance of the proposed method...
This paper deals with the problem of Adaptive Noise Cancellation (ANC) for speech signals corrupted by additive white Gaussian noise. After explaining the Least Mean Square (LMS)-based adaptive filter and the Kalman filter, we examine the hybrid Kalman-based LMS (KLMS) technique for adaptation of the ANC. The proposed technique suggests a way to normalize the LMS algorithm using a Kalman filter. Our...
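For orientation, the plain LMS adaptive noise canceller that the abstract above takes as its starting point can be sketched as follows. This is a minimal sketch of the textbook LMS update only, not the paper's Kalman-based KLMS variant. The function names, the four-tap filter, the step size, and the synthetic signals (a sinusoid standing in for speech, with a two-tap noise path) are all assumptions made for the example.

```python
import math
import random


def make_demo_signals(num_samples=4000, seed=0):
    """Hypothetical test data: a sinusoid 'speech' plus filtered white Gaussian noise."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    clean = [0.5 * math.sin(2.0 * math.pi * 0.01 * n) for n in range(num_samples)]
    primary = [
        clean[n] + 0.5 * noise[n] + (0.3 * noise[n - 1] if n > 0 else 0.0)
        for n in range(num_samples)
    ]
    return clean, noise, primary


def lms_noise_canceller(primary, reference, num_taps=4, step_size=0.01):
    """Textbook LMS: adapt an FIR filter so the filtered reference tracks the noise."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    output = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - noise_estimate  # the error signal is the enhanced output
        weights = [w + 2.0 * step_size * error * h for w, h in zip(weights, history)]
        output.append(error)
    return output, weights


clean, noise, primary = make_demo_signals()
enhanced, weights = lms_noise_canceller(primary, noise)
print(weights)  # the first two taps should drift toward the assumed noise path
```

The step size is fixed here; the abstract describes normalizing the LMS adaptation with a Kalman filter, which this sketch does not attempt.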