The retrieval source for a stage design knowledge base is the text script, and text processing has become the key technology for obtaining related information from the script. This paper proposes a method of extracting keywords from the script by category by analyzing the characteristics of the script
As the amount of data increases and the relations among them become more complex, access to the information implicit in the data becomes more difficult, and methods for obtaining data from diverse texts and analyzing them play a more significant role. One such method is the highly effective technique of keyword extraction
This paper presents a corpus-based approach for extracting keywords from text written in a language that has no word boundaries. Based on the concept of the Thai character cluster, a Thai running text is preliminarily segmented into a sequence of inseparable units, called TCCs. To enable the handling of a large-scale
This paper presents a new keyword extraction algorithm for Chinese news Web pages using lexical chains and word co-occurrence combined with frequency features, cohesion features, and correlation features. A lexical chain is an external performance consistency by semantically related words of a text, and is the
paper, we propose an automatic keyword extraction system and a Thai website categorization system that can automatically update the dictionary and categorize websites in Thai. The dictionary is a collection of vectors created by the automatic keyword extraction system. The results in terms of accuracy show that our
Online advertising has become one of the major revenue sources for today's Internet companies. Among the different advertising channels, contextual advertising accounts for a large share. Many studies already address the keyword extraction problem in contextual advertising for English; however
In search engines, keyword extraction is an important technique. In this paper, aiming at the defects of the traditional keyword extraction algorithm, we propose an improved weight computation strategy. The experimental results show that the improved method's results are significantly better than the
This paper presents an audio keyword detection method for highlight retrieval in basketball video. The keywords comprise shoe squeaking, speech, cheering, long whistles, and short whistles, which correspond to basketball game events. After feature analysis, the Simple Excellent Feature Combination based on Pearson
We propose a feature word selection method for classifying recommended shops using Yelp customer reviews. TextRank keywords are extracted from the customer reviews to construct the sorted positive and negative keyword lists based on each keyword's summed TextRank scores. The top-K keywords are then aggregated
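The aggregation step described in the abstract above (summing each keyword's TextRank scores across reviews, sorting, and keeping the top K) could be sketched roughly as follows. This is only an illustrative sketch: the function name `rank_keywords_by_summed_score`, the input layout (one keyword-to-score dictionary per review), and the sample scores are assumptions, not taken from the paper, and the TextRank scoring itself is assumed to have been done beforehand.

```python
from collections import defaultdict


def rank_keywords_by_summed_score(per_review_scores, top_k):
    """Sum each keyword's score over all reviews and return the top_k pairs.

    per_review_scores: list of dicts mapping keyword -> TextRank score
    (hypothetical layout; the scores are assumed to be precomputed).
    """
    totals = defaultdict(float)
    for review in per_review_scores:
        for keyword, score in review.items():
            totals[keyword] += score
    # Sort by descending summed score; break ties alphabetically.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Made-up scores for two positive reviews, for illustration only.
positive_reviews = [
    {"friendly": 0.5, "fresh": 0.25},
    {"friendly": 0.25, "cozy": 0.125},
]
top_positive = rank_keywords_by_summed_score(positive_reviews, top_k=2)
print(top_positive)
```

The same function would be applied separately to the negative reviews to obtain the second sorted list.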
Addressing the problem of spam emails on the Internet, this paper presents a comparative study on Naïve Bayes and Artificial Neural Networks (ANN) based modeling of spammer behavior. Keyword-based spam email filtering techniques fall short in modeling spammer behavior as the spammer constantly changes tactics to
email content only to build a keyword corpus, together with some text processing to handle obfuscation techniques. The algorithm was evaluated using the CSDMC2010 SPAM corpus dataset that contained 4327 emails in the training dataset and 4292 emails in the testing dataset. The experimental results show that the proposed
In this paper we propose an approach for Chinese question analysis and answer extraction. A general question analysis process contains keyword extraction and question classification. Question classification plays a crucial role in automatic question answering. To implement the question classification, we have carried
In this paper, we describe the use of a Boosting algorithm, Real AdaBoost, for content-based image retrieval (CBIR) on a large number (190) of keyword categories. Previous work with Boosting for image orientation detection has involved only a few categories, such as a simple outdoor vs. indoor scene dichotomy. Other
To exploit co-occurrence patterns among features and target semantics while keeping the simplicity of the keyword-based visual search, a novel reranking method is proposed. The approach, ordinal reranking, reranks an initial search list by utilizing the co-occurrence patterns via ranking functions such as ListNet
This paper presents a novel framework for multi-folder email classification using graph mining as the underlying technique. Although several techniques exist (e.g., SVM, TF-IDF, n-gram) for addressing this problem in a delimited context, they heavily rely on extracting high-frequency keywords, thus ignoring the
of cultural information. Therefore, text categorization research has become more important. The paper improves the precision of traditional text categorization by adjusting the weights of words, mining potential keywords, and then finding the relationships among them. At the end of the paper, an experiment was
of HTML page, and the proposed algorithms are performed. A complete evaluation is performed, which indicates the effectiveness of our technique. The experimental results show improved precision and recall with the proposed algorithms with respect to keyword-based search. The algorithms are implemented in Java and its
title, keyword, and link text information to represent the website. Heterogeneous classifiers are then built based on these different features. We propose a principled ensemble classification algorithm to combine the predicted results from different phishing detection classifiers. A hierarchical clustering technique has been
This paper presents a novel method to extract Protein-Protein Interaction (PPI) information from biomedical literature based on Support Vector Machine (SVM) and K Nearest Neighbors (KNN). The two protein names, words between two proteins, words surrounding two proteins, keyword between or among the surrounding words