Evaluating the accuracy of HMM-based and SVM-based spotters in detecting keywords and recognizing the true place of keyword occurrence shows that the HMM-based spotter detects the place of occurrence more precisely than the SVM-based spotter. On the other hand, the SVM-based spotter performs much better in detecting
In this paper, a new method of Chinese prosodic word tagging is presented. This method consists of a rule-based algorithm named “keyword anchor” and a statistical algorithm based on a hidden Markov model (HMM). For the keyword anchor algorithm, an anchor of the prosodic word is defined to help the system find the whole
In traditional keyword spotting (KWS) systems, the confidence measure (CM) of each keyword is computed from normalized acoustic likelihoods. In addition to likelihood-based scores, keyword-dependent features called predictor features, such as duration and prosodic features, can be defined to improve the performance of
This paper presents a novel method for deriving patterns for classification of speech sounds. In contrast to conventional methods that attempt to capture time-frequency patterns as represented by spectral envelopes or peaks, our method captures patterns of high-energy tracks, or seams, of maximum “whiteness” across frequency in spectrograms. Our hypothesis is that these seams could potentially carry...
actual language identification. On our bi-lingual lecture tasks the PPRLM system clearly outperforms the PPR system in various segment length conditions, although at the cost of slower run-time. By using lexical information in the form of keyword spotting and additional language models, we show ways to improve the
This paper describes ICSI's 2005 speaker recognition system, which was one of the top performing systems in the NIST 2005 speaker recognition evaluation. The system is a combination of four sub-systems: 1) a keyword conditional HMM system, 2) an SVM-based lattice phone n-gram system, 3) a sequential nonparametric
Semantic image retrieval using text such as keywords or captions at different semantic levels has attracted considerable research attention in recent years. Automatic image annotation (AIA) has proven to be an effective and promising solution to automatically deduce the high-level semantics from low-level visual
fusion (MMIF) strategy. Compared with traditional methods, the proposed scheme extracts a wealth of semantic-level features including anchor person, topic caption, face, silence, acoustic change, audio keywords and textual content. Parallel to this, we make use of a multi-modal information fusion strategy for news story