As an important preprocessing technology in patent knowledge utilization, patent classification should be accurate and efficient. Commonly used feature selection methods and classification algorithms, such as information gain (IG) and the k-nearest neighbors (k-NN) algorithm, perform well in general text classification but have some drawbacks in patent classification. In this paper, we focus on patent classification...
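For readers unfamiliar with the information gain criterion named in the abstract above, the following is a minimal sketch of the standard textbook definition for a single term, not the patent-specific method the paper goes on to propose. The function names `entropy` and `information_gain`, and the representation of each document as a set of tokens, are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """IG of `term`: class entropy minus entropy conditioned on term presence.

    docs   -- list of token sets (hypothetical representation)
    labels -- list of class labels, aligned with docs
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


# Toy data: "claim" separates the two classes perfectly, "device" not at all.
docs = [{"claim", "device"}, {"claim", "method"},
        {"engine", "device"}, {"engine", "method"}]
labels = ["A", "A", "B", "B"]
ig_claim = information_gain(docs, labels, "claim")
ig_device = information_gain(docs, labels, "device")
```

In a feature selection step, terms would typically be ranked by this score and only the highest-scoring ones kept as features.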
In the past decade, the massive growth of the Internet has brought huge changes to the way people live their daily lives; however, the biggest concern with the rapid growth of digital information is how to manage it efficiently and filter out unwanted data. In this paper, we propose a method for managing RSS feeds from various news websites. A Web service was developed to provide filtered news items extracted from...
Among the many text classification algorithms, KNN is a competitive one, with a simple implementation and high efficiency. However, as the size of the text collection expands, the runtime of KNN grows rapidly and becomes unaffordable. In this paper, we improve KNN by introducing the kd-tree storage structure and reducing the sample space through sample clustering methods. The experiment shows...
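The kd-tree idea mentioned in the abstract above can be sketched as follows. This is a generic kd-tree k-NN classifier under stated assumptions, not the authors' implementation: documents are assumed to be already reduced to short numeric vectors, the function names `build_kdtree`, `knn_search` and `classify` are hypothetical, and the sample-clustering step from the abstract is not shown.

```python
import heapq
from collections import Counter


def build_kdtree(points, depth=0):
    """Build a kd-tree from a list of (vector, label) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])          # cycle through dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2                    # median point splits the space
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def knn_search(node, query, k, heap=None):
    """Collect the k nearest points as a heap of (-squared_distance, label)."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    vec, label = node["point"]
    dist = sum((a - b) ** 2 for a, b in zip(vec, query))
    if len(heap) < k:
        heapq.heappush(heap, (-dist, label))
    elif dist < -heap[0][0]:                  # closer than current worst
        heapq.heapreplace(heap, (-dist, label))
    diff = query[node["axis"]] - vec[node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    knn_search(near, query, k, heap)
    # Visit the far side only if the splitting plane is closer than the
    # current k-th neighbor; this pruning is what saves time over brute force.
    if len(heap) < k or diff ** 2 < -heap[0][0]:
        knn_search(far, query, k, heap)
    return heap


def classify(tree, query, k=3):
    """Majority vote over the labels of the k nearest neighbors."""
    neighbors = knn_search(tree, query, k)
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


# Toy two-dimensional data standing in for document vectors.
data = [((1.0, 1.0), "sports"), ((1.2, 0.9), "sports"), ((0.8, 1.1), "sports"),
        ((5.0, 5.0), "finance"), ((5.2, 4.8), "finance"), ((4.9, 5.1), "finance")]
tree = build_kdtree(data)
```

Note that kd-trees tend to lose their advantage in very high-dimensional spaces, which is one reason a sketch like this assumes the vectors have been reduced first.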
We introduce a new method for dimensionality reduction by attribute extraction and evaluate its impact on text classification. The textual contents in the body sections of the news in Reuters-21578 are the selected attributes for classification. Using the proposed method, the high-dimensional attributes (words extracted from the news bodies) are projected onto a new hyperplane having dimensions equal to...
In this paper, we propose a general decision-layer classification fusion model, based on information fusion, for improving classification precision. Different multi-class classification algorithms each perform their own classification at the feature layer, their results are fed into the decision level, and the final classification result is output. This model is applied to improving...
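The abstract above does not say how the decision level combines the base classifiers' outputs. As one common possibility, and purely as an assumption, the sketch below fuses the predicted labels by (optionally weighted) voting; the function name `fuse_decisions` and the weights are hypothetical.

```python
from collections import Counter


def fuse_decisions(predictions, weights=None):
    """Combine one predicted label per base classifier into a final label.

    predictions -- list of labels, one from each base classifier
    weights     -- optional per-classifier weights (assumed; default 1.0 each)
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    scores = Counter()
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # Ties resolve to the label that was seen first.
    return max(scores, key=scores.get)


# Three base classifiers disagree; unweighted voting follows the majority,
# while a heavily weighted first classifier can override it.
majority = fuse_decisions(["politics", "sports", "politics"])
weighted = fuse_decisions(["politics", "sports", "sports"], weights=[3, 1, 1])
```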
We propose a feature called category browsing to enhance the full-text search function of a Thai-language news article search engine. Category browsing allows users to browse and filter search results based on predefined categories. To implement the category browsing feature, we applied and compared several text categorization algorithms, including decision tree, Naive Bayes (NB) and Support...
Today's signature-based anti-viruses are very accurate, but are limited in detecting new malicious code. Currently, dozens of new malicious codes are created every day, and this number is expected to increase in the coming years. Recently, classification algorithms were used successfully for the detection of unknown malicious code. These studies used a test collection with a limited size where...