Sub-band speech processing is well known in robust speech recognition. In recent years, deep neural networks (DNNs) have also been widely used in speech recognition for acoustic modeling and for feature extraction and transformation. In this paper, we propose using a deep belief network (DBN) as a post-processing method for de-noising at the Mel sub-band level, where we enhance logarithm...
In recent years, sub-band speech recognition has been found useful in robust speech recognition, especially for speech signals contaminated by band-limited noise. In sub-band speech recognition, the full-band speech is divided into several frequency sub-bands, and then the sub-band feature vectors, or the likelihoods generated from them by the corresponding sub-band recognizers, are combined to give the recognition result...
One solution for separating speech from music as a single-channel source separation problem is non-negative matrix factorization (NMF). In this approach, the spectrogram of each source signal is factorized as the product of two matrices, known as the basis and weight matrices. To obtain a proper estimate of the signal spectrogram, the weight and basis matrices are updated iteratively. To estimate distance...
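The factorization described in this abstract can be illustrated with a minimal sketch. It assumes the standard multiplicative update rules for the Euclidean distance; the abstract is truncated before it names the distance measure the authors actually use, so this choice, the function name `nmf`, and all parameter values are illustrative assumptions rather than the paper's method.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix V (freq x frames) as V ~= W @ H.

    W holds the basis vectors (freq x rank) and H the weights (rank x frames).
    Uses multiplicative updates for the Euclidean distance (an assumption).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Update the weights with the basis fixed, then the basis with the weights fixed.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "spectrogram": a random non-negative matrix of rank 4 (synthetic data).
rng = np.random.default_rng(1)
V = rng.random((16, 4)) @ rng.random((4, 50))
W, H = nmf(V, rank=4)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In a separation setting, one would typically learn a basis matrix per source from training data and then estimate only the weights on the mixture; that step is not shown here.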
A language model (LM) is essential for speech recognition systems. The efficiency of this model depends on its adaptation to the linguistic characteristics. Accordingly, adaptation methods attempt to use syntactic and semantic features for language modelling. Previous adaptation methods, such as the family of Dirichlet class language models (DCLM), exploit the class of history words. These methods due to...
Ideal binary mask speech enhancement has been shown to increase speech quality as well as speech intelligibility. However, this property depends heavily on accurately separating the speech and masker time-frequency units of the input spectrum, which is a difficult task in real situations. Ordinary binary mask methods are single-microphone methods and so can obtain little information from the environment...
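As a minimal sketch of the idea behind an ideal binary mask: each time-frequency unit is kept when its local signal-to-noise ratio exceeds a threshold and discarded otherwise. The mask is "ideal" because it needs the separate speech and masker powers, which are not available in real situations. The function name `ideal_binary_mask`, the 0 dB threshold, and the toy values are assumptions for illustration only.

```python
import numpy as np

def ideal_binary_mask(speech_power, noise_power, lc_db=0.0, eps=1e-12):
    """Return 1.0 for time-frequency units whose local SNR exceeds lc_db, else 0.0."""
    snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (snr_db > lc_db).astype(float)

# Toy 2x2 power spectrograms (rows: frequency bins, columns: frames).
speech = np.array([[4.0, 0.1],
                   [1.0, 9.0]])
noise = np.array([[1.0, 1.0],
                  [2.0, 1.0]])

mask = ideal_binary_mask(speech, noise)
enhanced = mask * (speech + noise)  # apply the mask to the mixture power
print(mask)
```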
Keyword spotting refers to the detection of all occurrences of any given word in a speech utterance. In this paper, we define the keyword spotting problem as a binary classification problem and propose a discriminative approach for solving it. Our approach exploits an evolutionary algorithm to determine the separating hyperplane between two classes: the class of sentences containing the target keywords and...
Filtering approaches in the spectral domain and the feature domain have shown their effectiveness for robust speech recognition. In this paper, we propose a two-step filtering method. In the first step, a spectral subtraction filter is applied to the speech spectrum. In the second step, we design a temporal structure normalization filter to apply to the features extracted from the filtered spectrum....
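The first step mentioned in this abstract, spectral subtraction, can be sketched in its common power-domain form: an estimate of the noise power is subtracted from each frame, and the result is floored to avoid negative values. The over-subtraction factor `alpha`, the `floor` fraction, and the function name are assumptions for illustration; the abstract does not specify the authors' exact filter, and the temporal structure normalization step is not shown.

```python
import numpy as np

def spectral_subtraction(noisy_power, noise_power, alpha=1.0, floor=0.01):
    """Subtract a per-bin noise power estimate from every frame.

    noisy_power: array of shape (n_freq, n_frames)
    noise_power: array of shape (n_freq,), e.g. averaged over non-speech frames
    """
    cleaned = noisy_power - alpha * noise_power[:, None]
    # Floor each unit at a small fraction of the noisy power.
    return np.maximum(cleaned, floor * noisy_power)

# Toy power spectrogram with 2 frequency bins and 2 frames.
noisy = np.array([[5.0, 1.0],
                  [2.0, 3.0]])
noise_estimate = np.array([1.0, 2.5])

cleaned = spectral_subtraction(noisy, noise_estimate)
print(cleaned)
```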