Experiments are carried out on datasets of different dimensionality selected from the UCI datasets, using two classical clustering algorithms. The results indicate that when the dimensionality of a real dataset is less than or equal to 30, distance-based clustering algorithms are effective. For high-dimensional datasets (dimensionality greater than 30), the clustering algorithms...
Simplified Silhouette Filter (SSF) is a recently introduced feature selection method that automatically estimates the number of features to be selected. To do so, a sampling strategy is combined with a clustering algorithm that seeks clusters of correlated (potentially redundant) features. It is well known that the choice of a similarity measure may have a great impact on clustering results. As a consequence,...
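To illustrate what clustering correlated features can start from, here is a minimal sketch of a feature-to-feature dissimilarity matrix. It assumes absolute Pearson correlation as the similarity measure; the abstract does not say which measures SSF uses, so `pearson` and `feature_dissimilarity` are hypothetical helpers, not the authors' method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def feature_dissimilarity(columns):
    """Assumed measure: 1 - |r|, so strongly correlated features are close."""
    m = len(columns)
    return [
        [1.0 - abs(pearson(columns[i], columns[j])) for j in range(m)]
        for i in range(m)
    ]


# Toy data: f2 is a scaled copy of f1 (redundant); f3 is uncorrelated with f1.
f1 = [1.0, 2.0, 3.0, 4.0]
f2 = [2.0, 4.0, 6.0, 8.0]
f3 = [1.0, -1.0, -1.0, 1.0]
dissim = feature_dissimilarity([f1, f2, f3])
```

A clustering algorithm run on `dissim` would tend to group `f1` and `f2` together, after which one representative per cluster could be kept. Swapping `pearson` for another measure changes the matrix and therefore possibly the clusters.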
Processing applications with a large number of dimensions has been a challenge to the data mining community. Feature selection is an effective dimensionality reduction technique. However, there are only a few methods proposed for feature selection for clustering. In this paper, a new feature selection algorithm for unsupervised learning is introduced. It is based on the assumption that, in absence...
In this work we present a novel method to model instance-level constraints within a clustering algorithm. Both similarity and dissimilarity constraints can be used simultaneously. The proposed extension is based on a distance transformation by shortest-path computations in a constraint graph. With a new technique, cannot-links are consistently supported and the dissimilarity is extended to their...
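The general idea of a shortest-path distance transformation can be sketched as follows. This is a simplified illustration covering must-link (similarity) constraints only, under the assumption that a must-linked pair gets distance zero and all other distances are then recomputed as shortest paths; it is not the paper's technique, and its cannot-link handling is not reproduced here. The name `transform_distances` is hypothetical.

```python
def transform_distances(dist, must_links):
    """Return a new distance matrix in which must-linked points are at
    distance 0 and every other distance is the shortest path through them."""
    n = len(dist)
    d = [row[:] for row in dist]  # copy so the input matrix is left unchanged
    for i, j in must_links:
        d[i][j] = d[j][i] = 0.0
    # Floyd-Warshall all-pairs shortest paths (intermediate node k outermost).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = d[i][k] + d[k][j]
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d


# Four points on a line; points 1 and 2 are declared must-linked.
points = [0.0, 1.0, 5.0, 6.0]
dist = [[abs(a - b) for b in points] for a in points]
new_dist = transform_distances(dist, must_links=[(1, 2)])
```

In this toy example the constraint pulls the two groups together: the distance between the outer points drops from 6.0 to 2.0, because the path now passes through the zero-length must-link.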
Although many cluster ensemble approaches exist, few of them consider prior knowledge of the datasets. In this paper, we propose a new cluster ensemble approach called knowledge-based cluster ensemble (KCE), which incorporates prior knowledge of the dataset into the cluster ensemble framework. Specifically, the prior knowledge of the dataset is first represented by the side information...
In this paper, an attempt has been made to explore the effect of the frequency of co-occurrence of features on the accuracy of clustering results. This has been achieved by incorporating the frequency component into the clustering algorithm. By frequency we mean the number of times a sequence of features appears in the data set. We try to utilize this component in the algorithm and study...