Feature selection seeks relevant features from the whole feature space to construct a feature subset, which helps with clustering, classification, and retrieval. This paper considers feature selection for text categorization. We put forward a filter feature selection scheme based on a class difference measure. The key idea of our proposed algorithm is the difference between the frequencies...
A major difficulty of text categorization is the extremely high dimensionality of the text feature space. Feature selection techniques are desirable in large-scale text categorization tasks to improve accuracy and efficiency. The χ² statistic and the simplified χ² are two effective feature selection methods in text categorization. Using these two feature selection criteria, for a term, one needs to...
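As a rough illustration of how a χ² criterion ranks terms, the sketch below uses the common textbook form of the χ² statistic for a term and a category, computed from a 2×2 table of document counts. It is not the abstract's simplified χ² variant, which is not described in the visible text. The function names `chi_square` and `select_top_k` and the toy counts are assumptions made for this example.

```python
def chi_square(a, b, c, d):
    """Chi-square score of a term for a category (textbook 2x2 form).

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table (e.g. the term occurs in every document)
        # carries no information about the category.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def select_top_k(tables, k):
    """Return the k terms with the highest chi-square scores."""
    scores = {term: chi_square(*counts) for term, counts in tables.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical counts (a, b, c, d) for three terms over 100 documents,
# 50 of which belong to the category of interest.
tables = {
    "goal": (40, 5, 10, 45),   # mostly inside the category
    "the": (50, 50, 0, 0),     # in every document, uninformative
    "vote": (3, 42, 47, 8),    # mostly outside the category
}
top = select_top_k(tables, 2)
```

Because the statistic squares the difference `a * d - c * b`, a term that is strongly associated with the absence of the category ("vote" here) scores as high as one associated with its presence, while a term found in every document scores zero.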
The matrix equation problem is an active research topic in computational mathematics. The Hermitian positive definite solutions of a matrix equation play an important role in real applications. In this paper, we present sufficient conditions for the existence of a positive definite solution to the nonlinear matrix equation $X - A^{*}\sqrt[2m]{X^{-1}}A = I$ and propose a...
Principal component analysis is a multivariate statistical method that simplifies the complex cross-correlation between variables. The basic idea of principal component analysis is to project the original observation data into a new low-dimensional space with minimal information loss and then to solve the problem at a significantly reduced size, but the classical principal...
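The projection idea can be sketched in a few lines. The example below implements only classical PCA for the first component, using power iteration on the sample covariance matrix. It is not the modified method the abstract goes on to describe. The function name `first_principal_component`, the iteration count, and the toy data are assumptions made for this illustration.

```python
import math


def first_principal_component(rows, iters=200):
    """Return (direction, scores) for the first principal component.

    rows: list of equal-length numeric sequences, one per observation.
    The direction is found by power iteration on the sample covariance.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(x[i] * x[j] for x in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Start from an all-ones vector; this assumes the start is not
    # orthogonal to the leading eigenvector.
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # zero variance along v, nothing to refine
        v = [x / norm for x in w]
    # Project each centered observation onto the direction.
    scores = [sum(x[j] * v[j] for j in range(d)) for x in centered]
    return v, scores


# Toy data lying exactly on the line y = 2x, so one component
# captures all of the variance.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, scores = first_principal_component(data)
```

For this toy data the recovered direction is proportional to (1, 2), and each two-dimensional observation is replaced by a single score along that direction.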
Text categorization usually involves a very large number of features, most of which are irrelevant or noisy and could mislead the classifier. To improve the efficiency and effectiveness of text categorization, feature selection is often performed. In this paper, a novel feature selection approach for text categorization, called Maximum Information Metric (MIM), is proposed...