The demand for text classification is growing significantly in web search, data mining, web ranking, recommendation systems, and many other fields of information technology. This paper illustrates the text classification process on different datasets using standard supervised machine learning techniques. Text documents can be classified with various kinds of classifiers. Labeled text...
Traditional manual design of analytical processes is challenging because it requires an analyst to have a good grasp of numerous algorithms and of the interaction effects between each technique and the data across multiple domains. In today's increasingly high-variety, multi-domain data environment especially, this design process can be very laborious. In this paper, we describe a design...
In recent years, text classification has been widely used, and the dimensionality of text data has grown steadily. The behavior of almost all classification algorithms depends directly on dimensionality: on high-dimensional data sets, classification algorithms both take longer to run and are prone to overfitting. Feature selection is therefore crucial for machine learning techniques. In this study, frequently used...
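To illustrate the kind of feature selection this abstract refers to, here is a minimal sketch that ranks terms by the chi-square statistic of a 2x2 term/class contingency table. The abstract is truncated and does not name its methods, so chi-square is an assumption chosen for illustration, and the names `chi_square`, `select_features`, and the toy `DOCS` data are hypothetical.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    if denom == 0:
        return 0.0  # term or class is constant across documents
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(labeled_docs, target_label, top_k=3):
    """Return the top_k terms most associated (positively or negatively)
    with target_label, using binary term presence per document."""
    doc_terms = [(set(text.lower().split()), label) for text, label in labeled_docs]
    vocab = set().union(*(terms for terms, _ in doc_terms))
    in_class = sum(1 for _, label in doc_terms if label == target_label)
    out_class = len(doc_terms) - in_class
    scores = {}
    for term in vocab:
        n11 = sum(1 for terms, label in doc_terms
                  if term in terms and label == target_label)
        n10 = sum(1 for terms, label in doc_terms
                  if term in terms and label != target_label)
        scores[term] = chi_square(n11, n10, in_class - n11, out_class - n10)
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]


# Hypothetical toy corpus, for illustration only.
DOCS = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the agenda now", "ham"),
]

if __name__ == "__main__":
    print(select_features(DOCS, "spam", top_k=3))
```

Note that the chi-square statistic is symmetric: a term that appears only outside the target class scores as highly as one that appears only inside it, since both are informative for the classifier.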
The paper is devoted to the automated categorization of textual information, which can be applied in systems intended to block inappropriate content. An approach to feature selection and construction is proposed. The text mining methods used in the research (Decision Tree classifiers) are analyzed. In addition, the techniques of Web sites analysis that provide information in different...
In recent years, research on text classification algorithms has remained a hot topic in text mining. KNN is a classic text classification algorithm. The rule for finding the nearest neighbors directly affects the performance and precision of categorization. In this paper, we mainly focus on distance measure and similarity. We propose a new text classification algorithm which combines KNN and Choquet...
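For context on why the neighbor-finding rule matters, here is a minimal sketch of a plain KNN text classifier using bag-of-words vectors and cosine similarity. This is the classic baseline only, not the algorithm proposed in the abstract (whose details are truncated here); the function names and the toy `TRAIN` data are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query, labeled_docs, k=3):
    """Label the query by majority vote among its k most similar documents."""
    q = vectorize(query)
    scored = sorted(
        ((cosine(q, vectorize(text)), label) for text, label in labeled_docs),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set, for illustration only.
TRAIN = [
    ("the team won the football match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the match ended in a draw", "sports"),
    ("the central bank raised interest rates", "finance"),
    ("stock markets fell as rates rose", "finance"),
    ("the bank reported record profit", "finance"),
]

if __name__ == "__main__":
    print(knn_classify("the bank cut interest rates", TRAIN, k=3))
```

Swapping `cosine` for another similarity or distance function changes which documents count as neighbors, which is the design choice the abstract singles out as decisive for categorization quality.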
We propose a novel algorithm, QuIET, for binary classification of texts. The method automatically generates a set of span queries from a set of annotated documents and uses the query set to categorize unlabeled texts. QuIET generates models that are human-understandable. We describe the method and evaluate it empirically against Support Vector Machines, demonstrating comparable performance for a...
Multi-Instance Learning (MIL) is a special scheme in machine learning. In recent research it has been successfully applied to text classification problems. However, MIL is naturally semi-supervised, since the instance labels are unknown for positive bags. This either reduces the accuracy of predictors or requires additional computational cost to reduce the uncertainty or to guess such labels at a high probability...