Crime against women in India has become a prominent topic of discussion in recent years, and the issue has come to the fore because of the rising trend in crimes committed against women. Most of these crimes are reported, and a massive dataset is generated every year. Analysing the crime reports can help law enforcement take preventive measures for reducing...
In this knowledge era, a plethora of textual information is growing rapidly, usually as semi-structured or unstructured data collected and stored in various databases. Discovering knowledge from these databases is not simple. Thus, an automatic feature selection approach is necessary for processing this unstructured data. The feature selection approach focuses on processing...
The rapid increase in the number of text documents available on the Internet has created pressure to use effective cleaning techniques, which are needed to convert these documents into structured documents. Text cleaning is one of the key mechanisms in typical text mining application frameworks. In this paper, we explore the role of text cleaning in the 20 Newsgroups dataset,...
Opinion mining is the challenging task of identifying the opinions or sentiments underlying user-generated content, such as online product reviews, blogs, and discussion forums. Previous studies that adopt machine learning algorithms mainly focus on designing effective features for this complex task. This paper presents our approach based on tree kernels for opinion mining of online product reviews....
Feature selection is an important preprocessing step in Chinese text categorization; it reduces the high dimensionality and, compared with feature extraction, keeps the reduced results comprehensible. A novel criterion for coarse feature filtering is proposed, which integrates the strengths of term frequency-inverse document frequency as an inner-class measure and chi-square (CHI) as an inter-class measure, and a new feature...
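The abstract above is truncated before it states how the two measures are combined, so the proposed criterion itself cannot be reproduced here. As a minimal sketch of only the inter-class ingredient, the standard chi-square statistic for a term and a class can be computed from a 2x2 contingency table. The function name `chi_square`, the toy documents, and the labels below are hypothetical illustrations, not the authors' data or method.

```python
def chi_square(docs, labels, term, target):
    """Standard chi-square score of `term` for class `target`.

    docs   -- list of token lists (hypothetical toy input)
    labels -- class label for each document
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1          # term present, document in class
        elif has_term:
            b += 1          # term present, document in another class
        elif in_class:
            c += 1          # term absent, document in class
        else:
            d += 1          # term absent, document in another class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0          # degenerate table: no evidence either way
    return n * (a * d - c * b) ** 2 / denom


# Toy corpus, assumed for illustration only.
docs = [
    ["goal", "match", "team"],
    ["team", "win", "goal"],
    ["vote", "party", "election"],
    ["party", "vote", "win"],
]
labels = ["sport", "sport", "politics", "politics"]

goal_score = chi_square(docs, labels, "goal", "sport")
win_score = chi_square(docs, labels, "win", "sport")
```

In this toy corpus "goal" appears only in the "sport" documents and so scores high, while "win" is split evenly across both classes and scores zero.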
This paper presents the results of classifying Arabic text documents using a decision tree algorithm. Experiments are performed over two self-collected corpora, and the results show that the suggested hybrid approach, Document Frequency Thresholding combined with the embedded information gain criterion of the decision tree algorithm, is the preferable feature selection criterion. The study concluded that...
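The abstract above does not give implementation details, so the following is only a sketch of the two general ideas it names: document frequency thresholding as a coarse filter, followed by information gain, the split criterion commonly embedded in decision tree learners. The function names, the `min_df` threshold, and the toy documents are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def document_frequency_filter(docs, min_df=2):
    """Keep terms that occur in at least `min_df` documents."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))      # count each term once per document
    return {term for term, count in df.items() if count >= min_df}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction from splitting documents on presence of `term`."""
    with_term = [lab for tok, lab in zip(docs, labels) if term in tok]
    without_term = [lab for tok, lab in zip(docs, labels) if term not in tok]
    total = len(labels)
    remainder = 0.0
    for subset in (with_term, without_term):
        if subset:
            remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Toy corpus, assumed for illustration only.
docs = [
    ["goal", "match", "team"],
    ["team", "win", "goal"],
    ["vote", "party", "election"],
    ["party", "vote", "win"],
]
labels = ["sport", "sport", "politics", "politics"]

kept = document_frequency_filter(docs, min_df=2)
ranked = sorted(kept, key=lambda t: information_gain(docs, labels, t), reverse=True)
```

Here the threshold drops the terms that occur in a single document ("match", "election"), and the surviving terms are then ranked by how well their presence separates the two classes.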