Synchrophasors are state-of-the-art sensors that measure voltage, current, or frequency at a high data rate. This paper presents an approach to analyzing the streaming smart-grid data generated by synchrophasors. A novel unit-circle representation is used to visualize the real-time phasor data. A density-based clustering (DBSCAN) method is proposed to cluster the phasor data to detect bad-data...
Clustering is a technique in which a given data set is divided into groups, called clusters, such that similar data points lie together in one cluster. Clustering plays an important role in data mining because of the large number of data sets involved. This paper reviews the various clustering algorithms available for data mining and provides a comparative analysis of the various...
In this paper, we present our work on analyzing data sets that contain a large number of data points. We study similarity search problems, which find the data points closest to a given query point. We also study cluster analysis, which detects subgroups of a data set whose data points are similar to each other. In this paper we design an algorithm to detect the clusters in subspaces...
This paper presents a new sequential clustering algorithm based on sequential hard c-means clustering. The term sequential cluster extraction means that the algorithm extracts one cluster at a time. Sequential hard c-means is one of the typical, conventional sequential clustering methods. The proposed new sequential clustering algorithm is based on Dave's noise clustering approach. A characteristic...
Recent studies have suggested significant differences in the motor performance of Parkinson's Disease (PD) patients who have L-dopa induced dyskinesias (LIDs), even when they are off L-dopa medication. The pathophysiology of LIDs remains obscure, so applying data-mining techniques to the patients' motor performance may provide some heuristic insight. This paper investigated visually-guided tracking performance...
Grid-based clustering algorithms are attractive for partitioning data in spatial databases. In the context of big data, more and more research focuses on resolving the conflict between the efficiency and the accuracy of clustering. Existing grid-based clustering algorithms are generally time-efficient but do not consider the distribution of the data inside a grid cell. In this paper, a...
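The generic grid-based idea mentioned in the abstract above can be sketched as follows. This is a minimal illustration only, not the algorithm proposed in the paper (whose details are truncated here). The function name `grid_cluster` and the parameters `cell_size` and `min_count` are assumptions made for this example.

```python
from collections import defaultdict, deque


def grid_cluster(points, cell_size, min_count):
    """Hypothetical sketch: label 2-D points by connected dense grid cells.

    Returns one label per point; -1 marks points in sparse cells.
    """
    # Bin every point into the square cell that contains it.
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(int(x // cell_size), int(y // cell_size))].append(idx)

    # A cell is "dense" if it holds at least min_count points.
    dense = {c for c, members in cells.items() if len(members) >= min_count}

    labels = [-1] * len(points)
    seen = set()
    cluster = 0
    for start in sorted(dense):
        if start in seen:
            continue
        # Breadth-first search over adjacent dense cells (8-neighbourhood).
        seen.add(start)
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for idx in cells[(cx, cy)]:
                labels[idx] = cluster
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        cluster += 1
    return labels


points = [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.3, 0.4),
          (5.1, 5.2), (5.3, 5.4), (9.5, 0.5)]
labels = grid_cluster(points, cell_size=1.0, min_count=2)
print(labels)
```

The sketch counts points per cell and never looks at where they sit inside the cell, which is the limitation of existing grid-based methods that the abstract points out.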
In this paper we propose a noise detection system based on similarities between instances. Given a data set whose instances belong to multiple classes, a noise instance denotes a wrongly classified record. The similarity between differently labeled instances is determined by computing the distances between them using several standard metrics. In order to ensure that this approach is computational...
Data mining is the process of extracting knowledge from huge amounts of data. The data can be stored in databases and information repositories. Data mining tasks can be divided into two models: descriptive and predictive. In the predictive model, values are predicted from different sets of sample data; predictive tasks are classified into three types: classification, regression, and time series. Descriptive...
Spatial clustering is one of the main methods of data mining and knowledge discovery. The DBSCAN algorithm can find clusters of arbitrary shape in spatial databases with noise, which makes it a good clustering algorithm. This paper introduces the basic concept and principle of the DBSCAN algorithm and applies it to cluster analysis of the distribution of Weibo location information...
This investigation develops a new data clustering technique. It is a new density-based clustering scheme that uses diagonal sampling and a new fold-and-rotation method to enhance data clustering performance. The proposed algorithm expands clusters without selecting data points in a way that increases computation cost, and it may considerably lower time cost. The experimental results confirm that the presented approach...
Data mining is all about data analysis techniques. It is useful for extracting hidden and interesting patterns from large datasets. Clustering techniques are important for extracting knowledge from the large amounts of spatial data collected from various applications, including GIS, satellite images, X-ray crystallography, remote sensing, and environmental assessment and planning. To extract...
The computation time of most clustering algorithms is affected by the number of attributes and instances. The data mining community has therefore made efforts to make cluster induction efficient. Scalability is thus a critical issue that the data mining community faces. One way to handle this issue is to use a subset of all instances. This paper suggests an...
Nowadays, organizations are facing several challenges when they try to analyze generated data with the aim of extracting useful information. This analytical capacity needs to be enhanced with tools capable of dealing with big data sets without making the analytical process a difficult task. Clustering is usually used, as this technique does not require any prior knowledge about the data. However,...
Real-time applications that generate continuous flows of data streams have become more popular. As a result, clustering data streams has attracted many researchers. Most data stream clustering algorithms are based on a distance function; they find only spherical clusters and are unable to deal with noisy data. Therefore density-based clustering algorithms substitute remarkable...
Spatio-temporal clustering is a subfield of data mining that is gaining increasing scientific attention due to advances in location-based and environmental devices that register position, time and, in some cases, other semantic attributes. This process aims to group objects based on their spatial and temporal similarity, helping to discover interesting patterns and correlations in large...
Density-based clustering algorithms can detect clusters of arbitrary shape, handle outliers, and do not need the number of clusters in advance. However, they cannot work properly in multi-density environments. The existing multi-density clustering algorithms have problems that limit their applicability to data streams, such as the need for the whole data set to perform clustering, two-pass clustering, and high execution time...
Clustering is an important tool that has seen explosive growth in machine learning algorithms. The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is one of the primary methods for clustering in data mining. DBSCAN can find clusters of variable sizes and shapes, and it also detects noise. The two important parameters Epsilon (Eps)...
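As a rough illustration of how DBSCAN's two standard parameters interact, the following minimal sketch implements the textbook algorithm in plain Python. It is not taken from the paper above. It assumes the conventional parameter pair: a neighbourhood radius (`eps`) and a minimum neighbour count (`min_pts`, usually called MinPts). The helper names and the sample data are made up for this example.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (i included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN sketch: one label per point, NOISE (-1) for outliers."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        # points[i] is a core point: start a new cluster and expand it.
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

With these assumed values the two squares form separate clusters and the isolated point is labeled as noise. Each region query scans every point, so this sketch runs in quadratic time; practical implementations use a spatial index instead.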
The analysis of high dimensional data comes with many intrinsic challenges. In particular, cluster structures become increasingly hard to detect when the data includes dimensions irrelevant to the individual clusters. With increasing dimensionality, distances between pairs of objects become very similar, and hence, meaningless for knowledge discovery. In this paper we propose Cartification, a new...
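The distance concentration effect described in the abstract above can be checked empirically. The sketch below is only an illustration under assumed settings (uniform random points in the unit hypercube, Euclidean distance, a fixed seed). It is unrelated to the Cartification method itself, and `relative_contrast` is a name chosen for this example.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(max - min) / min distance from a random query to n_points random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

In this setup the contrast between the nearest and the farthest point shrinks as the dimensionality grows, which is why nearest-neighbour distances lose discriminative power in high-dimensional data.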
Data mining techniques are very popular nowadays and are used in NLP (Natural Language Processing). They allow users to analyze data from many different perspectives, categorize it, and summarize the relationships identified. One of these techniques, clustering items into groups, has been very popular. We use this technique here to find different topics in a document. We aim to replicate previous...
Millions of geo-tagged photos are becoming available due to the widespread use of photo-sharing websites. These social media capture attractive points of interest and contain interesting photo-taking patterns. The massive amount of this user-oriented data produces new challenges, and understanding people's photo-taking behavior is of great importance for local tourism-related businesses. This paper analyzes...