Supervised learning is a popular approach to text classification among the research community as well as within the software development industry. It enables intelligent systems to solve various text analysis problems such as document organization, spam detection and report scoring. However, the extremely difficult and time-intensive process of creating a training corpus makes it inapplicable to many...
related keywords as representative vectors for different sentiments, we use these vectors as the sentiment classifier for the testing set. We achieved results that are not only comparable to traditional methods like Naïve Bayes and SVM, but also outperform Latent Dirichlet Allocation, TF-IDF and its variant. It also
Deep learning had a significant impact on diverse pattern recognition tasks in the recent past. In this paper, we investigate its potential for keyword spotting in handwritten documents by designing a novel feature extraction system based on Convolutional Deep Belief Networks. Sliding window features are learned from
topic analysis of LDA for feature selection and compare it with the classical feature selection metrics in text categorization. For the experiments, we use SVM as the classifier and tf*idf for weighting the terms. We observed that in almost all metrics, information gain performs best at all keyword numbers while
, pattern recognition) to detect such critical documents. To address difficult or ambiguous instances, we supplement the text classifier with an automated keyword search. That is, we extract, in an automated fashion, discriminative terms (i.e., keywords) from the training set and match them against documents during the
loss function during training is that it aims at maximizing not only the relative ranking scores, but also adjusts the system to use a fixed threshold and thus maximizes the detection accuracy rates. We use the new loss function in the structured prediction setting and extend the discriminative keyword spotting algorithm
Automatic image annotation is crucial for keyword-based image retrieval. There is a trend focusing on utilization of machine learning techniques, which learn statistical models from annotated images and apply them to generate annotations for unseen images. In this paper we propose MAGMA - new image auto-annotation
mechanisms with a traditional indexing method. The goal is to identify a higher semantic content and more meaningful keyword combinations, considering both supervised and unsupervised techniques. Within a specific implementation both Bayesian learning as well as clustering are integrated to support a boost parameter towards
part of a trending discussion topic by the presence of a tagged keyword. Relying solely on this keyword, however, may be inadequate for identifying all the discussion associated with a trend. Our research demonstrates that machine learning techniques can be used to identify the top trend a tweet belongs to with up to 85
Image classification is an important research topic due to its potential impact on both image processing and understanding. However, due to the inherent ambiguity of image-keyword mapping, this task becomes a challenge. From the perspective of machine learning, the image classification task fits the multi-instance
Automatic image annotation is a promising solution to enable more effective image retrieval by keywords. Different statistical models and machine learning methods have been introduced for image auto-annotation. In this paper, we propose a collaborative approach, in which multiple different statistical models are
, the improved model is capable of discovering the correlation between blobs (segmented regions) and textual keywords so as to automatically generate keywords for un-annotated images according to joint probabilities. Moreover, it has the ability to detect and remove false keyword(s) by considering the co-occurrence of
In this paper we present a spoken query detection method based on posteriorgrams generated from Deep Boltzmann Machines (DBMs). The proposed method can be deployed in both semi-supervised and unsupervised training scenarios. The DBM-based posteriorgrams were evaluated on a series of keyword spotting tasks using the
. Thus, it is of great significance for enterprises to find reasonable solutions automatically. Combined with keyword tokenization, data mining, numerical optimization and neural network, this paper presents a system that compares and finds the most similar incident solution in the past, based on the description provided by
With the development of the World Wide Web, there are more and more illicit drug Webpages. Thus, how to screen cannabis Webpages on the internet is quite an important issue. Conventional methods that only use the keyword-based or image-based approaches are not sufficient. We propose a Multi-Modal Multiple-Instance
In this paper, a new method for question classification is proposed, which employs ensemble learning algorithms to train multiple question classifiers. These component learners are combined to produce the final hypothesis. In detail, the feature spaces are obtained through extracting high-frequency keywords from
Most web search engines use only the search keywords for searching. Due to the ambiguity of semantics and usages of the search keywords, the results are noisy and many of them do not match the user's search goals. This paper presents the design of an intelligent Search Bot, which operates as an agent for a user by
of content. The main contribution of FIRSt is an integrated strategy that enables a content-based recommender to infer user interests by applying machine learning techniques, both on official item descriptions provided by a publisher and on free keywords which users adopt to annotate relevant items. Static content and
Annotating documents with keywords or ‘tags’ is useful for categorizing documents and helping users find a document efficiently and quickly. Question and answer (Q&A) sites also use tags to categorize questions to help ensure that their users are aware of questions related to their areas of expertise
Automatic image annotation is an important but highly challenging problem in semantic-based image retrieval. In this paper, we formulate image annotation as a supervised learning image classification problem under a region-based image annotation framework. In region-based image annotation, keywords are usually