We propose EC3, a novel algorithm that merges classification and clustering together in order to support both binary and multi-class classification. EC3 is based on a principled combination of multiple classification and multiple clustering methods using a convex optimization function. We additionally propose iEC3, a variant of EC3 that handles imbalanced training data. We perform an extensive experimental...
Clustering is an effective method for data analysis and can be exploited to reveal unknown features of data samples; its applications range from data mining to bioinformatics analysis. Several clustering approaches have been proposed in order to obtain a better trade-off between accuracy and efficiency of the clustering process. It is well-known that no existing clustering algorithm completely satisfies...
This survey highlights issues in clustering that hinder achieving an optimal solution or generate inconsistent outputs. We call such malignancies dark patches. We focus on the issues relating to clustering rather than on its concepts and techniques. For better insight into these issues, we categorize dark patches into three classes and then compare various clustering methods...
Clustering analysis is an active research branch in the area of data mining due to its simplicity and rapidity. However, the K-means algorithm depends heavily on the initial clustering centers and easily falls into a local optimum. In this paper, we study the optimization of the K-means algorithm in depth. We put forward the first selected initial clustering center of K-means...
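The sensitivity to initial centers mentioned in the abstract above can be illustrated with a small sketch. This is plain Lloyd-style K-means on hypothetical 1-D data, not the initialization scheme proposed in the paper; the function name `kmeans_1d` and the data points are assumptions chosen for illustration.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Plain Lloyd iterations on 1-D data, starting from the given centers."""
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    # Sum of squared errors to the nearest center.
    sse = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, sse


# Hypothetical data with three obvious groups.
points = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]

# One initial center per group: reaches the best partition.
good_centers, good_sse = kmeans_1d(points, [0.0, 10.0, 20.0])
# Two initial centers in the same group: stuck in a local optimum.
bad_centers, bad_sse = kmeans_1d(points, [0.0, 1.0, 10.0])

print(good_centers, good_sse)  # [0.5, 10.5, 20.5] 1.5
print(bad_centers, bad_sse)    # [0.0, 1.0, 15.5] 101.0
```

Both runs converge, but the second one stops at a partition whose error is far larger, which is the local-optimum behavior that motivates careful selection of initial centers.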
In this paper, we propose a novel Latent Multi-view Subspace Clustering (LMSC) method, which clusters data points with latent representation and simultaneously explores underlying complementary information from multiple views. Unlike most existing single view subspace clustering methods that reconstruct data points using original features, our method seeks the underlying latent representation and...
This paper presents a new differential evolution algorithm for multimodal optimization that uses self-adaptive parameter control, clustering and crowding methods. The algorithm includes a new clustering mechanism that is based on small subpopulations with the best strategy and, as such, improves the algorithm's efficiency. Each subpopulation is generated according to the best individual from a population...
With the increasing volume of text information, dealing with it has become incredibly complicated. Text clustering is a suitable technique for dealing with a tremendous number of text documents by grouping these documents into clusters. Ultimately, text documents hold sparse, non-uniform distribution and uninformative features are difficult to cluster...
Subspace clustering methods based on spectral clustering have been very popular due to their theoretical guarantees and empirical success. However, when the constraint information of the data is considered, these subspace-clustering-based constraint clustering algorithms have difficulty achieving better clustering results on high-dimensional data with data nuisances. This paper proposes a novel constraint...
The fuzzy c-means method is investigated for clustering heavy-tailed data using several distance measures. A comparison study is provided based on time and precision. The results show that the Euclidean distance requires less time than the Manhattan distance, but precision is higher with the Manhattan distance.
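A minimal sketch of fuzzy c-means with a pluggable distance function may help make the comparison above concrete. It is not the study's implementation: the data, the fuzzifier `m = 2`, and the function names are assumptions, and the center update keeps the usual weighted mean for both metrics (a simplification, since the mean is only the exact minimizer for the squared Euclidean distance).

```python
def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def memberships(points, centers, dist, m=2.0, eps=1e-12):
    """Membership update: u_ki = 1 / sum_j (d_ki / d_kj)^(2 / (m - 1))."""
    power = 2.0 / (m - 1.0)
    U = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        d = [max(dist(p, c), eps) for c in centers]
        U.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
    return U


def fuzzy_c_means(points, centers, dist, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = [tuple(c) for c in centers]
    dim = len(points[0])
    for _ in range(iters):
        U = memberships(points, centers, dist, m)
        new_centers = []
        for i in range(len(centers)):
            # Weighted mean of all points, with weights u_ki^m.
            w = [row[i] ** m for row in U]
            total = sum(w)
            new_centers.append(tuple(
                sum(wk * p[t] for wk, p in zip(w, points)) / total
                for t in range(dim)
            ))
        centers = new_centers
    return centers, memberships(points, centers, dist, m)


# Hypothetical 2-D data with two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
init = [(0.0, 0.0), (10.0, 10.0)]

for name, dist in (("euclidean", euclidean), ("manhattan", manhattan)):
    centers, U = fuzzy_c_means(points, init, dist)
    print(name, [tuple(round(x, 2) for x in c) for c in centers])
```

Timing each call (for example with `time.perf_counter`) and scoring the resulting memberships against known labels would reproduce the kind of time-versus-precision comparison the abstract describes; no such figures are claimed here.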
A novel framework to optimize the identification clustering of multipath scatterers in a MIMO wireless system is proposed. It is a comprehensive evaluation of major cluster identification methods across multiple categories of clustering methodologies. The reliability will be ensured with the use of a parameter selection framework utilizing the Bayesian Information Criterion (BIC). Statistical preprocessing...
Real-world datasets consist of data representations (views) from different sources which often provide information complementary to each other. Multi-view learning algorithms aim at exploiting the complementary information present in different views for clustering and classification tasks. Several multi-view clustering methods that aim at partitioning objects into clusters based on multiple representations...
Clustering is an unsupervised learning approach that explores data and seeks groups of similar objects. Many classical clustering models such as k-means and DBSCAN are based on heuristic algorithms and suffer from locally optimal solutions and numerical instability. Recently, convex clustering has received increasing attention; it leverages sparsity-inducing norms and enjoys many attractive...
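For orientation, the convex clustering objective as it commonly appears in the literature is shown below. This is the general textbook form, not necessarily the exact formulation used in the paper above; the symbols $x_i$ (data points), $u_i$ (per-point centroids), $w_{ij} \ge 0$ (pair weights), $\lambda \ge 0$ (regularization strength), and the norm index $q$ are notation assumed here.

```latex
\min_{u_1,\dots,u_n} \;
  \frac{1}{2} \sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
  \;+\; \lambda \sum_{i<j} w_{ij} \, \lVert u_i - u_j \rVert_q ,
  \qquad q \in \{1, 2, \infty\}
```

The second term is the sparsity-inducing part: as $\lambda$ grows, more differences $u_i - u_j$ are driven exactly to zero, and points sharing the same centroid form a cluster. Because the whole objective is convex, it has a global minimizer independent of initialization, in contrast to k-means.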
By exploring alternative approaches to combinatorial optimization, we propose the first known formal connection between clustering and set partitioning, with the goal of identifying a subclass of set partitioning problems that can be solved efficiently and with optimality guarantees through a clustering approach. We prove the equivalence between classical centroid clustering problems and a special...
In this paper, we introduce an innovative fuzzy clustering model that includes some prior knowledge about the data. The prior knowledge is the data correlations expressed in the form of a graph. Specifically, in this new model, we add a graph regularization term to the objective function of Fuzzy C-Means (FCM) to fine-tune the final clustering result. By doing so, when we conduct fuzzy clustering to classify...
Ultrasonic inspection technique has been widely used in defect detection of carbon fiber reinforced polymer (CFRP) materials. Although a variety of signal processing methods have been applied to highlight the defect features contained in ultrasonic signals, the most usual way to identify the defective regions is still not automatic, which is not only time-consuming but also critically dependent upon...
We describe a novel technique for clustering short texts, such as URLs, without enriching them with the help of external knowledge sources. Our technique first performs feature clustering to identify the key features of the dataset and then reconstructs the dataset on the basis of the key features. Then, it computes the similarity of the short texts belonging to the reconstructed dataset...
Clustering is among the most common data mining techniques, and fuzzy clustering can model the world even more realistically and more precisely. One of the most favorable fuzzy clustering methods is the Fuzzy C-Means (FCM) algorithm, which is actually identical to the (original) K-Means clustering algorithm fueled with a fuzzy flavor. However, there are some issues with the fuzzy clustering methods;...
The Genetic Algorithm (GA) is an effective method for solving Traveling Salesman Problems (TSPs); nevertheless, the Classical Genetic Algorithm (CGA) performs poorly on large-scale TSPs. To overcome this problem, this paper presents two improved genetic algorithms based on clustering to find the best results of TSPs. The main process is clustering, intra-group evolution...
At present, the electric power industry is in a period of rapid development, and the assessment of regional power development is capturing growing attention. Through the evaluation process, we can deepen the understanding of regional development ability and put forward suggestions for further development. However, the understanding of regional power development is still at the perceptual level now and...
Overlapping clustering is an important technique in machine learning that aims to organize data into a set of non-disjoint groups rather than the disjoint groups produced by conventional clustering methods. Several machine learning applications require that data objects be assigned to one or several groups, resulting in a non-disjoint partitioning of data, such as document clustering where each...