In document categorization methods that use similarity measures based on word vectors, it is important to determine the key words that characterize each document. However, conventional methods select the key words based on their frequency and/or a particular importance index such as tf-idf. In this paper, we propose a method to characterize each document by using temporal clusters of technical term usages...
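As an illustration of the conventional baseline mentioned in this abstract, the following is a minimal sketch of tf-idf based key word selection. It is not the method proposed in the paper. The function name `tfidf_keywords`, the whitespace tokenizer, and the unsmoothed weighting (term frequency times log(N / document frequency)) are assumptions chosen for brevity.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring tf-idf terms for each document.

    Hypothetical helper: tokenization is a plain lowercase whitespace split,
    and idf is the unsmoothed log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    keywords = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        keywords.append(ranked[:top_k])
    return keywords
```

A term that occurs in every document receives a score of zero under this weighting, so it is never preferred over a term that is specific to one document.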
Text clustering is a useful and inexpensive way to organize vast text repositories into meaningful topical categories. Although text clustering can be seen as an alternative to supervised text categorization, the question remains of how to determine whether the resulting clusters are of sufficient quality in a real-life application. However, it is difficult to evaluate a given clustering of documents. Furthermore,...
Document clustering is a widely studied problem in text categorization. It is the process of partitioning a given set of documents into disjoint clusters such that documents in the same cluster are similar. K-means, one of the simplest unsupervised learning algorithms, solves the well-known clustering problem following a simple and easy way to classify a given data set through a certain number...
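The partitioning described in this abstract can be sketched with a plain K-means loop. This is a minimal illustration, not the implementation studied in the paper. It assumes that documents have already been converted to numeric vectors, uses squared Euclidean distance, and takes the first `k` points as initial centroids (a naive, deterministic choice made only for reproducibility). The names `kmeans` and `squared_distance` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100):
    """Partition points into k disjoint clusters.

    Returns (labels, centroids), where labels[i] is the cluster index
    assigned to points[i].
    """
    # Assumption: initialize centroids from the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = None

    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]

    return labels, centroids
```

Every point receives exactly one label, so the resulting clusters are disjoint. The result depends on the initial centroids, which is why practical implementations usually try several initializations.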
The wide availability of electronic data has led to great interest in text analysis, information retrieval, and text categorization methods. To provide a better service, there is a need for document analysis and categorization systems for non-English text, comparable to those currently available for English text documents. This study is mainly focused on categorizing Indic language documents. The main techniques examined...
Data mining (DM) brings together knowledge and theories from several fields, including databases, machine learning, optimization, statistics, and data visualization, and has been applied to various real-life applications. A large number of data mining articles have been published. The goal of this study is to establish an overview of the past and current data mining research activities from the title and abstract...