We propose EC3, a novel algorithm that merges classification and clustering together in order to support both binary and multi-class classification. EC3 is based on a principled combination of multiple classification and multiple clustering methods using a convex optimization function. We additionally propose iEC3, a variant of EC3 that handles imbalanced training data. We perform an extensive experimental...
Mutual information clustering is an agglomerative hierarchical clustering method that has been used to group random variables or sets thereof. Some researchers have found that the normalization method used can lead to oddly-sized clusters that do not line up with expected results. We introduce a new normalization parameter to control the size of the clusters, and apply it to food allergy data from...
HDBSCAN*, a state-of-the-art density-based hierarchical clustering method, produces a hierarchical organization of clusters in a dataset w.r.t. a parameter mpts. While the performance of HDBSCAN* is robust w.r.t. mpts, choosing a "good" value for it can be challenging: depending on the data distribution, a high or low value for mpts may be more appropriate, and certain data clusters may...
We suggested a clustering method that makes it possible to build a conceptual clustering model for objects of a fuzzy nature and to increase clustering accuracy for such objects. We used the Cobweb clustering method as a base. We modified the formula for assessing the utility of conceptual clustering for objects with fuzzy parameter values. Then we suggested a modified Cobweb version for working...
Glioblastoma multiforme (GBM) is the deadliest malignant type of brain tumor, with a very poor prognosis and a median survival of around one year. Numerous studies have reported tumor subtypes that consider different characteristics of individual patients, which may play important roles in determining the survival rates in GBM. In this study, we present a pathway-based clustering method using Restricted...
Clustering is an important unsupervised data analysis technique, which divides data objects into clusters based on similarity. Clustering has been studied and applied in many different fields, including pattern recognition, data mining, decision science and statistics. Clustering algorithms can be mainly classified as hierarchical and partitional clustering approaches. Partitioning around medoids...
Missing data is a data mining problem that adversely affects data analysis and decision-making processes, and it is frequently encountered in healthcare data for a variety of reasons. Missing data is still an important research topic because the success of a handling method is influenced by many factors, such as the characteristics of the data and the type of missing data. In this study, a clustering and...
Satellite telemetry data is the only basis on which experts can determine the working status and health status of an in-orbit satellite. Pattern mining and extraction from satellite telemetry data are highly significant for automatic judgment and anomaly detection. Clustering, as an important time series data mining method, can achieve automatic and intelligent analysis of satellite telemetry...
Cluster analysis aims at classifying data elements into different categories according to their similarity. It is a common task in data mining and is useful in various fields, including pattern recognition, machine learning, and information retrieval. Because the area has been studied extensively, many clustering methods have been proposed in the literature. Among them, some methods focus on mining clusters with arbitrary...
Clustering is an important task in data mining, especially for continuous streams of data, i.e. "data streams". However, existing clustering approaches neglect some characteristics of this kind of data. The similarity between entities in the temporal dimension is underestimated. A forgetting mechanism is adopted to remove old patterns and save computation resources. However,...
In this paper, we propose a persistent scatterer clustering method for high-resolution structure displacement analysis. Persistent scatterer interferometry can monitor millimetric displacement of structures like bridges, buildings, and roads by analysis at persistent scatterers (PSs), pixels with high coherence in synthetic aperture radar (SAR) images. However, it requires great time and effort to...
This paper introduces an approach to outlier mining in the context of rule-based knowledge bases. Rules in knowledge bases are a very specific type of data representation and it is necessary to analyze them carefully, especially when they differ from each other. The goal of the paper is to analyze the influence of using different similarity measures and clustering methods on the number of outliers...
In the past decade, with the worldwide initiative to upgrade the electrical grid to a smart grid, a significant amount of data has been generated by the grid on a daily basis. Therefore, there has been an increasing need to handle and process these data efficiently. In this paper, we present our experience in applying unsupervised clustering on PMU data for event characterization on the smart...
Smart Grid, a modern approach to electricity distribution, requires innovation on various fronts. Communication is a key component of Smart Grid applicability. To satisfy Quality of Service (QoS) needs when deciding on network structure and topology, especially in urban areas, artificial intelligence techniques may be applied. Techniques such as clustering methods or genetic algorithms are useful...
Nearest Neighbor Classification (NNC) has been widely used as a classification method, due to its simplicity, classification efficiency and ability to deal with different classification problems. Despite its good classification accuracy, NNC suffers from several shortcomings: long execution time, noise sensitivity, high storage requirements and lack of interpretability. In this paper, we propose...
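As an illustration of the nearest neighbor rule mentioned in this abstract, the following is a minimal sketch, not the method the paper proposes. The function name `knn_predict`, the toy data, and the choice of Euclidean distance are assumptions made for this example only.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=1):
    """Return the majority label among the k training points nearest to query.

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    """
    # Rank every stored training point by its distance to the query.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    # Vote among the labels of the k closest points.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy data: two well-separated groups labeled "a" and "b".
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
labels = ["a", "a", "b", "b"]

print(knn_predict(points, labels, (0.3, 0.1)))        # nearest point is in group "a"
print(knn_predict(points, labels, (4.8, 5.1), k=3))   # two of three neighbors are "b"
```

The sketch also shows where the shortcomings listed in the abstract come from: every prediction scans the whole training set, and the whole training set has to be kept in memory.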
Clustering is one of the fundamental techniques in data mining and plays a crucial role in various applications, such as pattern recognition, document retrieval, and computer vision. So far, many effective algorithms have been proposed. Affinity Propagation is an algorithm that requires no parameter indicating the number of clusters, which is the most distinguishing advantage compared to...
This paper examines metric spaces in which the distance between any pair of nodes is given by an interval. The goal is to investigate methods for hierarchical clustering, i.e., a family of nested partitions indexed by a connectivity parameter, deduced from the underlying distance intervals of the metric spaces. Our construction is based on designing admissible methods abiding by the axioms of value...
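The phrase "a family of nested partitions indexed by a connectivity parameter" can be illustrated with ordinary single linkage clustering. This is a sketch under stated assumptions: it uses scalar distances rather than the interval distances studied in the paper, single linkage is a standard technique substituted here for illustration and is not the paper's construction, and the names `partition_at`, `dist` and `delta` are hypothetical.

```python
def partition_at(dist, delta):
    """Partition nodes 0..n-1 at connectivity level delta (single linkage).

    Two nodes share a block if a chain of pairwise distances <= delta
    connects them. Hypothetical helper using scalar distances only.
    """
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the block representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the blocks of every pair that is close enough at this level.
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= delta:
                parent[find(i)] = find(j)

    blocks = {}
    for node in range(n):
        blocks.setdefault(find(node), []).append(node)
    return sorted(blocks.values())


# Toy symmetric distance matrix on four nodes.
dist = [
    [0, 1, 4, 6],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [6, 6, 2, 0],
]

for delta in (0, 1, 2, 3):
    print(delta, partition_at(dist, delta))
```

As `delta` grows, blocks only ever merge, so each partition refines the next one. That nesting is what makes the family a hierarchy.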
Many unclassified web services are available on the internet, which makes it difficult for users to choose the correct one. This raises the service discovery cost, the time spent transforming data between services, and the service search time. Adequate methods, tools, and technologies for clustering web services have been developed. The clustering of web services is done manually. This survey is organized...
Conventional clustering algorithms are based on the assumption that a data point can be assigned to only a single cluster. However, in several types of data a data point belongs to multiple categories, which causes the ground-truth clusters to overlap. To handle this situation, several algorithms have been proposed, referred to as “overlapping clustering”. One of the state-of-the-art partition-based overlapping...
This paper describes a method for object detection using 3D point cloud measurements in a sea environment. The method employs the RBNN clustering method and uses a 3D Lidar, mono-vision and stereo-vision cameras, and a radar vision system. A radially based nearest neighbors (RBNN) clustering technique is adopted to perform object detection through 3D point cloud clustering. RBNN is constructing clusters...