Automatic text classification is a key technology for processing and organizing large-scale text data. The high dimensionality of the feature space is a well-known challenge for text classification. To mitigate this problem, and inspired by existing work, we propose a text feature selection algorithm that fuses the classical methodologies of Gini index and...
Text feature selection is a key technology in text classification and text information retrieval. Information gain, a feature selection method, is widely applied in text categorization. This paper theoretically analyzes the deficiencies of information gain as a feature selection method and then introduces two improvement factors, which are LDFWF (Limiting Document Frequency's Word Frequency)...
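For readers unfamiliar with the baseline criterion named above, the following is a minimal sketch of standard information gain for a single term, assuming documents are represented as sets of terms and each document has one class label. It does not implement the improvement factors proposed in the paper; the function names and the toy corpus are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Standard information gain of `term` with respect to the class labels.

    IG(t) = H(C) - [P(t) * H(C | t present) + P(not t) * H(C | t absent)]
    `docs` is a list of term sets; `labels` is the parallel list of classes.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


# Hypothetical toy corpus: two "sport" and two "politics" documents.
docs = [
    {"goal", "match"},
    {"match", "team"},
    {"vote", "party"},
    {"party", "election"},
]
labels = ["sport", "sport", "politics", "politics"]

# Rank the vocabulary by information gain, highest first.
vocabulary = sorted(set().union(*docs))
ranking = sorted(
    vocabulary,
    key=lambda t: information_gain(docs, labels, t),
    reverse=True,
)
```

In this toy corpus, a term such as "match" that appears in every document of one class and in none of the other separates the classes perfectly and therefore ranks above a term such as "goal" that appears in only one document.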
The World Wide Web serves as a huge repository of information that is highly dynamic, diverse, and growing at an exponential rate. To speed up and further improve tasks such as information search and retrieval and personalization, it is important to develop techniques that classify text documents more accurately and efficiently than before. This paper is an effort...
This paper compares the performance of linear and nonlinear kernels of Support Vector Machines (SVM) used for text classification. The study is motivated by the earlier view that a linear SVM performs better than a nonlinear one, and by the observation that, although many investigations have shown that SVM performs well in text classification, there is no serious investigation on the comparison between...
Style-based text authorship identification extracts features from texts of known authorship, constructs a classifier, and then uses it to identify the authors of disputed texts. Authorship identification belongs to the domain of style classification and is a branch of text classification. In contrast with text classification, which deals with the content of texts, authorship identification focuses on the form of texts...