The traditional k-means algorithm is sensitive to the choice of initial centers. To address this problem, this paper proposes a new method for finding the initial centers that reduces this sensitivity. The algorithm first computes the density of the area to which each data object belongs; then it finds k data objects that belong to high-density areas as the initial...
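The two steps named in the abstract above (estimate a density per data object, then take k objects from high-density areas) can be sketched as follows. This is only an illustration, not the paper's method: the abstract is truncated, so the density definition (neighbor count within `radius`), the separation rule between seeds, and the names `density_seeds` and `radius` are all assumptions.

```python
import math

def density_seeds(points, k, radius):
    """Pick up to k initial centers from high-density regions (illustrative sketch)."""
    n = len(points)
    # Assumed density definition: number of other points within `radius`.
    density = [
        sum(1 for j in range(n) if j != i and math.dist(points[i], points[j]) <= radius)
        for i in range(n)
    ]
    # Visit points from densest to sparsest (ties keep their input order).
    order = sorted(range(n), key=lambda i: density[i], reverse=True)
    seeds = []
    for i in order:
        # Assumed rule: skip candidates within `radius` of an already chosen seed,
        # so that the seeds do not all come from the same dense region.
        if all(math.dist(points[i], s) > radius for s in seeds):
            seeds.append(points[i])
        if len(seeds) == k:
            break
    return seeds

# Two dense groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
seeds = density_seeds(points, k=2, radius=0.5)
```

On this toy data the seeds come from the two dense groups, and the isolated point at (9.0, 0.0) is never selected, which is the behavior a density-based initialization is meant to provide.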
Although the fuzzy k-modes algorithm removes the numeric-only limitation of the k-means algorithm, representing each attribute of the centroid with a single category value and using a simple distance measure compromise its precision and make it prone to falling into local optima. In this paper, an extended fuzzy k-means (xFKM) algorithm for clustering categorical-valued data is presented, in which...
Cognitive maps, one of the hot topics in computational intelligence research, have been widely used in knowledge representation and decision-making. When cognitive maps are mined from data resources, outlier data seriously affect their accuracy. Therefore, based on an analysis of traditional algorithms, this paper proposes a new outlier data detection algorithm. The algorithm...
Data clustering is one of the most powerful techniques for knowledge discovery from data. In this paper, a novel approach to hierarchical clustering is proposed over a non-binary search space. In addition to the agglomerative methods, the proposed algorithm considers the Strength of Presence associated with each transaction, to yield quality clusters which are again closer to the real-life situation...
Document clustering is an unsupervised approach extensively used to navigate, filter, summarize, and manage large document repositories such as the World Wide Web (WWW). Recently, the focus in this domain has shifted from traditional vector-based document similarity for clustering to suffix-tree-based document similarity, as it offers a more semantic representation of the text present in the document...
Cluster analysis is one of the main analytical methods in data mining, and the choice of clustering algorithm directly influences the clustering results. This paper discusses the standard k-means clustering algorithm and analyzes its shortcomings, such as the need to calculate the distance between each data object and all cluster...
More and more data in practice changes every minute and is collected incrementally, and incremental clustering has attracted much attention from researchers. However, little research so far focuses on partitioning categorical data in incremental mode. How to design incremental clustering for categorical data is an urgent problem. We propose an incremental clustering method for categorical data using...
To retain customers effectively and enhance marketing capabilities, it is necessary to improve the personalization of e-commerce systems. Clustering is a reliable and efficient technology for providing personalized service in e-commerce systems. However, current research on clustering algorithms is usually based on either numeric data or categorical data. To analyze customer behavior, mixed data sets must be...
Sorting and clustering methods inspired by the behavior of real ants are among the earliest methods in ant-based meta-heuristics. We revisit these methods in the context of a concrete application and introduce some modifications that yield significant improvements in terms of both quality and efficiency. In this paper, we propose an Improved entropy-based ant clustering (IEAC) algorithm. Firstly,...
First, some improvements were made to a single ant colony clustering algorithm. Then, for ant colonies with different speeds, clustering analysis was carried out independently and in parallel, imitating the collaborative behavior of multiple colonies; the clustering results were combined into a hypergraph, and a second partitioning was performed on the hypergraph using ACA. Finally, the test results for four databases...
Clusters in protein interaction networks can potentially help identify functional relationships among proteins. The clustering problem can be modeled as a graph cut problem. Given an edge-weighted graph, the problem is to partition the vertices of the graph into k partitions of prescribed sizes such that the total weight of the edges within partitions is maximized. This problem is NP-complete for...
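To make the objective in the abstract above concrete, the sketch below scores a partition by the total weight of edges whose endpoints share a part, and finds the best assignment for prescribed part sizes by exhaustive search. The exhaustive search is a stand-in chosen only for illustration (it is feasible for a handful of vertices) and says nothing about the paper's actual algorithm; the function names and the toy graph are hypothetical.

```python
from itertools import permutations

def within_weight(edges, assignment):
    """Total weight of edges whose two endpoints lie in the same part."""
    return sum(w for (u, v), w in edges.items() if assignment[u] == assignment[v])

def best_partition(vertices, edges, sizes):
    """Exhaustively search assignments with the prescribed part sizes.

    Assumes sum(sizes) == len(vertices). Exponential time: toy inputs only.
    """
    # One label per vertex slot, e.g. sizes [2, 2] -> [0, 0, 1, 1].
    labels = [part for part, size in enumerate(sizes) for _ in range(size)]
    best_score, best_assignment = None, None
    for perm in set(permutations(labels)):
        assignment = dict(zip(vertices, perm))
        score = within_weight(edges, assignment)
        if best_score is None or score > best_score:
            best_score, best_assignment = score, assignment
    return best_score, best_assignment

# Hypothetical 4-vertex graph with edge weights.
vertices = ["a", "b", "c", "d"]
edges = {("a", "b"): 3, ("c", "d"): 2, ("a", "c"): 1, ("b", "d"): 1}
score, assignment = best_partition(vertices, edges, sizes=[2, 2])
```

Here the best split groups `a` with `b` and `c` with `d`, keeping the two heaviest edges inside the parts.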
Outlier detection is the process of detecting data objects that are grossly different from or inconsistent with the remaining data. Some of its important applications in the field of data mining are fraud detection, customer behavior analysis, and intrusion detection. There are a number of good algorithms for detecting outliers if the entire data set is available and algorithms can...
Trajectory clustering is attractive for the task of class identification in spatial databases. The existing trajectory clustering algorithm TRCLUS uses global parameters to discover common trajectories. However, it cannot discover small, dense clusters and is sensitive to its two input parameters. Based on the partition-and-group framework, we propose a simple but effective trajectory clustering algorithm...
Clustering is an important task in data mining with numerous applications, including minefield detection, seismology, and astronomy. At present, the academic community has introduced various clustering algorithms, and these methods have been widely applied to different fields according to their respective characteristics. In this paper, we propose a novel clustering algorithm based on symmetric...
Cluster analysis is a primary method for database mining. Most clustering algorithms require input parameters that are hard to determine but have a significant influence on the clustering result. Furthermore, for many real data sets there does not exist a global parameter setting for which the result of the clustering algorithm describes the intrinsic clustering structure accurately. We introduce...
With the rapid development of the Internet and communication technology, huge amounts of data are being accumulated. Short texts such as chat-room conversations and email are common in such data. It is useful to cluster such short documents to get the structure of the data or to help build other data mining applications. But most of the current clustering algorithms cannot achieve acceptable clustering accuracy...
In this work we present a novel method to model instance-level constraints within a clustering algorithm. Both similarity and dissimilarity constraints can thereby be used simultaneously. The proposed extension is based on a distance transformation by shortest path computations in a constraint graph. With a new technique, cannot-links are consistently supported and the dissimilarity is extended to their...
Many enterprises incorporate information gathered from a variety of data sources into an integrated input for some learning task. For example, when designing an automated diagnostic tool for some diseases, one may wish to integrate data gathered from many different hospitals. Analyzing and mining these distributed heterogeneous data sources requires distributed machine learning and data...
DBSCAN is one of the most popular algorithms for cluster analysis. It can discover clusters of arbitrary shape and separate out noise. But this algorithm can't choose its parameters according to the distribution of the data set. It simply uses the global MinPts parameter, so the clustering result for multi-density databases is inaccurate. In addition, when it is used to cluster large databases, it will...
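The limitation described in the abstract above can be seen in a minimal sketch of standard DBSCAN with one global pair of parameters. This is textbook DBSCAN written for illustration, not the improved algorithm the paper proposes; the names `eps` and `min_pts` and the toy data are assumptions.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Minimal textbook DBSCAN with global eps/min_pts (O(n^2) neighbor search)."""
    labels = [None] * len(points)

    def neighbors(i):
        # A point counts as its own neighbor, as in the usual formulation.
        return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels

# One tight group and one loose group (hypothetical multi-density data).
dense = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.1, 0.1)]
sparse = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]
tight = dbscan(dense + sparse, eps=0.3, min_pts=3)
loose = dbscan(dense + sparse, eps=1.5, min_pts=3)
```

With `eps=0.3`, a value suited to the tight group, every point of the loose group is labeled noise; only the larger `eps=1.5` recovers both groups on this toy data. On real multi-density data a single global setting often cannot serve all regions at once.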
In the traditional FCM clustering algorithm, each feature is assumed to have equal importance. Considering that different features have different importance, this paper presents an improved FCM algorithm with adaptive feature weights for each cluster, named AWFCM. In the iterative AWFCM process, to identify the importance of the features of each cluster, the feature weights are computed dynamically based...