Training a bottleneck feature (BNF) extractor with multilingual data has been common in low-resource keyword search. In a low-resource application, the amount of transcribed target-language data is limited, while there is usually plenty of multilingual data. In this paper, we investigated two methods to train
In this paper we describe the 2016 BBN conversational telephone speech keyword spotting system, the culmination of four years of research and development under the IARPA Babel program. The system was constructed in response to the NIST Open Keyword Search (OpenKWS) evaluation of 2016. We present our technological
In this paper, a method of automatic Chinese keyword extraction based on KNN is proposed. First, the document is preprocessed using a vector space model. Second, a set of candidate keywords is constructed using the KNN method and the labeled dataset. Finally, the candidate keywords are post-processed by the character of
Keywords normally carry a large amount of category information. In order to fully utilize this kind of information for text classification, this paper proposes a new text feature conversion method based on the SKG model. The method uses the classified texts with the listed keywords as the training data to train the
Multilingual (ML) representations play a key role in building speech recognition systems for low-resource languages. The IARPA-sponsored BABEL program focuses on building speech recognition (ASR) and keyword search (KWS) systems in over 24 languages with limited training data. The most common mechanism to derive ML
In this paper, we describe the use of a Boosting algorithm, Real AdaBoost, for content-based image retrieval (CBIR) on a large number (190) of keyword categories. Previous work with Boosting for image orientation detection has involved only a few categories, such as a simple outdoor vs. indoor scene dichotomy. Other
characteristics, are playing an important role in user indexing, personalized recommendation, and so on. Previous works apply keyword extraction methods to present the interests of users. However, it is hard for keyword extraction to give accurate results when the data is deficient and noisy. In this paper, we propose a novel method
cross-domain expert classification model in the medical community, we combine the user's inherent information as the keyword together with the user's potential information, thereby improving the cross-domain model of medical community expert classification. Through the data collected in the medical community, our experimental results
implement the proposed method in three ways: hand-designed keyword matching, machine learning, and a hybrid of the two. We also evaluate classification performance on five typical event categories. As a result, we confirmed that the hybrid method achieves the highest average F-score of the three, 0.674.
By analyzing the classification process and the MapReduce computing paradigm, it is found that the parallel and distributed computing model in MapReduce is appropriate for constructing a classifier model. This paper presents a MapReduce algorithm for parallel and distributed classification, aiming to reduce the computational time of the training process on large-scale document collections. Our experiment...
Automatic image annotation is a promising methodology for image retrieval. However, most current annotation models are not yet sophisticated enough to produce high-quality annotations. Given an image, some keywords irrelevant to the image contents are produced, which is a primary obstacle to getting high-quality image
Image annotation is the process of assigning proper keywords to describe the content of a given image, which can be regarded as a problem of multi-object image classification. In this paper, a general multi-label annotation algorithm is proposed, which is based on sparse representation theory and employs a multi-level
In this paper, we present a scene interpretation framework for Synthetic Aperture Radar (SAR) images, using keywords of the image contents provided by users. The framework consists of incorporation of prior knowledge with SAR iMage Annotation Tool (SARMAT), representation of SAR images, and prediction of scene labels
knowledge from an existing knowledge bank by extracting linguistic information such as part-of-speech and co-occurrence of keywords and constructing a new domain-adaptive transfer knowledge bank. Through experiments on homogeneous and heterogeneous feature spaces, we verify the efficacy of our methods.