Huge amounts of data are available in large-scale networks of autonomous data sources dispersed over a wide area. Data mining is an essential technology for obtaining hidden and valuable knowledge from these networked data sources. In this paper, we investigate clustering, one of the most important data mining tasks, in one such networked computing environment, i.e., Peer-to-Peer (P2P) network...
With the rapid development of network technology and the prevalence of the Internet, e-learning has become the major trend in the development of international education since the 1980s, and an important route to the internationalization and informatization of education. To meet the personalized needs of learners in e-learning, a new Web text clustering method for personalized e-learning based on...
How to reduce the number of frequent itemsets effectively is a hot topic in data mining research. Clustering frequent itemsets is one solution to the problem. Since generators are lossless concise representations of all frequent itemsets, clustering generators is equivalent to clustering all frequent itemsets. This paper proposes a new algorithm for clustering frequent itemsets based on generators...
Because common clustering algorithms use the vector space model (VSM) to represent documents, the conceptual relationships between related terms that do not literally co-occur are ignored. A genetic algorithm-based clustering technique, named GA clustering, is proposed in this article in conjunction with ontology to overcome this problem. In general, the ontology measures can be partitioned into two categories:...
This paper proposes a self-organized genetic algorithm for document clustering based on a semantic similarity measure. Traditionally, a document is represented as a string of words, and conceptual similarity is ignored. We take advantage of a thesaurus-based ontology to overcome this problem. To investigate how ontology methods could be used effectively in document...
A new algorithm for Web text clustering mining is presented, based on the Discovery Feature Sub-space Model (DFSSM). The algorithm comprises the SOM training stage and the clustering stage, and is characterized by self-stability and strong noise resistance. It can distinguish the most meaningful features from the Concept Space without an evaluation function. We have applied the algorithm...
In this paper we propose a document representation model based on latent semantic analysis (LSA) for text clustering. Most classic clustering systems represent a document with a set of indices, an approach known as the vector space model (VSM). In such a model, documents are encoded as vectors in an N-dimensional space, where N is the number of unique terms. However, this method causes that the scalability...
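The vector space model mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's method: the whitespace tokenizer, the raw term counts, the helper names `build_vsm` and `cosine`, and the three toy documents are all assumptions chosen for illustration.

```python
from collections import Counter
import math


def build_vsm(docs):
    """Encode each document as a term-count vector over the shared vocabulary.

    Hypothetical helper: tokenization is a plain lowercase whitespace split.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # N (the vector length) is the number of unique terms across all documents.
    vocab = sorted({term for tokens in tokenized for term in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy documents (illustrative only).
docs = ["car engine repair", "automobile motor repair", "car engine"]
vocab, vectors = build_vsm(docs)

print(len(vocab))                       # N, the dimensionality of the space
print(cosine(vectors[0], vectors[1]))   # only the literal term "repair" is shared
print(cosine(vectors[0], vectors[2]))   # two literal terms are shared
```

In this sketch the first two documents share only one literal term, so their similarity is low even though "car" and "automobile" are related; N also grows with every new unique term in the collection.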
In this paper, we propose a genetic algorithm (GA) method for text clustering based on the singular value decomposition technique. The main difficulty in applying GA to text clustering is its long string representation in high-dimensional space. Because the most straightforward and popular approach represents texts with the vector space model (VSM), that is, each unique term in the vocabulary...