We explore techniques to improve the robustness of small-footprint keyword spotting models based on deep neural networks (DNNs) in the presence of background noise and in far-field conditions. We find that system performance can be improved significantly, with relative improvements up to 75% in far-field conditions
integrated feature set is obtained after normalization of both sets of features thus obtained. This integrated feature set is used in a Hidden Markov Modeling (HMM) framework along with a novel sliding syllable protocol for keyword spotting. Keyword spotting experiments are conducted on the Hindi language database developed for
The paper proposed a method to realize speech-to-gesture conversion for communication between normal and speech-impaired people. Keyword spotting was employed to recognize the keywords from input speech signals. At the same time, the three-dimensional gesture models of keywords were built by 3D modeling technology
of vocabulary words in the user's speech utterance. In this paper, we investigate an approach that can be deployed in keyword spotting systems. We propose a phoneme classifier that will be ultimately used to provide confidence values to be compared against existing Automatic Speech Recognizer word confidences. The end
Retrieving Proper Names (PNs) specific to an audio document can be useful for vocabulary selection and OOV recovery in speech recognition, as well as in keyword spotting and audio indexing tasks. We propose methods to infer and retrieve OOV PNs relevant to an audio news document by using probabilistic topic models
a text database using the filtered results. We further conduct a cache-based adaptation method on the resulting language model, in which keywords in the filtered results are cached and used to boost the word probability. In an experimental evaluation over real lectures, we obtained a significant improvement of ASR