The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Feature selection is a process which chooses a subset from the original feature set according to some rules. The selected feature retains original physical meaning and provides a better understanding for the data and learning process. However, few modern feature selection approaches take the advantage of features' context information. Based on this analysis, we propose a novel feature selection method...
Feature selection is a valid method to reduce the dimension of vector in text categorization system. After analyzed several common evaluation functions for feature selection, we applied terms weight function to feature selection. A new evaluation function based on improved TFIDF method is presented; in this function the category information is introduced to feature items, and the feature items of...
With the rapid growth in computer technology and popularization of Internet, e-mail has become one economical and convenient form of communication. But different types of crime and civil action involving e-mail documents appear which do harm to people's life and social's stabilization. So the criminal e-mail's authorship has to be identified automatically for the purpose of computer forensic. To solve...
Automatic text classifying is an import application of the information processing technology. This paper introduces the key techniques of Chinese text categorization such as text preprocessing, feature selection, feature representation, training and classifying algorithm, especially analyses the current most important several feature selection methods with emphasis. A Chinese text classifier based...
One of the most vital problems of free-text document processing is the curse of dimensionality. The paper presents a dimensionality reduction algorithm based on informed feature selection. Terms describing the document are based on histogram-like statistics which can be computed as well as incrementally updated at low complexity. The document representation can adapt to changing document collection...
Text categorization is continuing to be one of the most researched NLP problems due to the ever-increasing amounts of electronic documents and digital libraries. In this paper, we present a new text categorization method that combines the distributional clustering of words and a learning logic technique, called Lsquare, for constructing text classifiers. The high dimensionality of text in a document...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.