This paper presents our recent attempt to build a super-large-scale spoken-term detection system that can detect any keyword uttered in a 2,000-hour speech database within a few seconds. Three problems must be solved to achieve such a system. The system must be able to detect out-of-vocabulary (OOV) terms (OOV problem
This paper considers an unsupervised data selection problem for the training data of an acoustic model and the vocabulary coverage of a keyword search system in low-resource settings. We propose to use Gaussian component index based n-grams as acoustic features in a submodular function for unsupervised data selection
This paper proposes a novel system to automatically determine the sport type of a sports game by performing keyword spotting on short fragments (around 10 minutes) of the game. In this system, we first develop an audio segmentation module as a front end to separate announcers' speech efficiently from the
In this work, keyword search (KWS) is based on a symbolic index that uses posteriorgram representation of the speech data. For each query, sum-to-one normalization or keyword specific thresholding is applied to the search results. The effect of these methods on the proposed KWS system is investigated. Results are
The paper considers how to increase the precision of word detection in an unsupervised keyword spotting method. The method is based on examining the signal similarity of two analyzed media descriptions: recorded voice and a word (textual query) synthesized using Text-to-Speech tools. The descriptions of media were given
This paper presents a novel method for deriving patterns for classification of speech sounds. In contrast to conventional methods that attempt to capture time-frequency patterns as represented by spectral envelopes or peaks, our method captures patterns of high-energy tracks, or seams, of maximum “whiteness” across frequency in spectrograms. Our hypothesis is that these seams could potentially carry...
Spoken term detection, especially of out-of-vocabulary (OOV) keywords, benefits from the use of sub-word systems. We experiment with different language-independent approaches to sub-word unit generation, generating both syllable-like and morpheme-like units, and demonstrate how the performance of syllable-like units