Search results

chapter

Robust Graph-Theoretic Clustering Approaches Using Node-Based Resilience Measures

John Matta, Tayo Obafemi-Ajayi, Jeffrey Borwey, Donald Wunsch, more

2016 IEEE 16th International Conference on Data Mining (ICDM) > 320 - 329

2016 IEEE 16th International Conference on Data Mining (ICDM)

This paper examines a schema for graph-theoretic clustering using node-based resilience measures. Node-based resilience measures optimize an objective based on a critical set of nodes whose removal causes some severity of disconnection in the network. Beyond presenting a general framework for the use of node-based resilience measures for variations of clustering problems, we emphasize the unique...

chapter

Operational analysis of k-medoids and k-means algorithms on noisy data

Wellington Simbarashe Manjoro, Mradul Dhakar, Brijesh Kumar Chaurasia

2016 International Conference on Communication and Signal Processing (ICCSP) > 1500 - 1505

2016 International Conference on Communication and Signal Processing (ICCSP)

Clustering is used in many applications, and the choice of algorithm depends on the nature of the task to be carried out. Before choosing a clustering algorithm, one needs to understand the task and then select the algorithm accordingly, based on the capabilities and performance metrics of that algorithm. This paper makes an...

chapter

Multi-source data clustering

Tiancheng Li, Juan M. Corchado, Javier Bajo, Shudong Sun

2015 18th International Conference on Information Fusion (Fusion) > 830 - 837

2015 18th International Conference on Information Fusion (Fusion)

In this paper, we consider a special multi-source data clustering problem in which data points from the same source cannot be grouped into the same cluster (the cannot-link (CL) constraint), and the sizes of the generated clusters are subject to maximum thresholds. No prior information is given about the level of clutter (namely noisy data) or the number of clusters. Particularly, the clusters...

chapter

A Linear-Clustering algorithm for controlling quality of large scale water-level data in Thailand

Nuttapon Pattanavijit, Peerapon Vateekul, Kanoksri Sarinnapakorn

2015 12th International Joint Conference on Computer Science and Software Engineering (JCSSE) > 269 - 274

2015 12th International Joint Conference on Computer Science and Software Engineering (JCSSE)

The Hydro and Agro Informatics Institute (HAII) has installed more than 800 telemetry stations across Thailand to collect water level data for operational tasks and research, e.g., a flood prevention system. To obtain accurate results, it is crucial to control the quality of data by detecting and filtering out anomalies. In our previous work, a data quality management system to capture various types...

chapter

A new online clustering approach for data in arbitrary shaped clusters

Richard Hyde, Plamen Angelov

2015 IEEE 2nd International Conference on Cybernetics (CYBCONF) > 228 - 233

2015 IEEE 2nd International Conference on Cybernetics (CYBCONF)

In this paper we demonstrate a new density-based clustering technique, CODAS, for online clustering of streaming data into arbitrary shaped clusters. CODAS is a two-stage process using a simple local density to initiate micro-clusters which are then combined into clusters. Memory efficiency is gained by not storing or re-using any data. Computational efficiency is gained by using hyper-spherical...

chapter

Applicability of clustering techniques on masquerade detection

Reshma Raveendran, Dhanya K. A

2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 2343 - 2348

2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

In a masquerade attack, an attacker impersonates a legitimate user. Most masquerade detection techniques to date are based on supervised learning. In this paper, masquerade detection based on unsupervised learning techniques is used. The clustering algorithms used are K-Means, K-Medoid, agglomerative clustering and DBSCAN. A comparative study is done based on...

chapter

A fast clustering method based on multi-splitting grid

Meng Fanyu, Xu Yajing, Gao Zhe, Lin Zhiqing

2014 4th IEEE International Conference on Network Infrastructure and Digital Content > 449 - 452

2014 4th IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC)

Grid-based clustering algorithms are attractive for partitioning data in spatial databases. In the context of big data, more and more research focuses on how to resolve the conflict between the efficiency and accuracy of clustering. Existing grid-based clustering algorithms generally have high time efficiency but do not consider the distribution of the data inside a grid. In this paper, a...

chapter

A Practical Approach on Cleaning-Up Large Data Sets

Marius Barat, Dumitru Bogdan Prelipcean, Dragos Teodor Gavrilut

2014 16th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing > 280 - 284

2014 16th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)

In this paper we propose a noise detection system based on similarities between instances. Given a data set with instances that belong to multiple classes, a noise instance denotes a wrongly classified record. The similarity between differently labeled instances is determined by computing distances between them using several of the standard metrics. In order to ensure that this approach is computational...

chapter

Word Sense Induction with Multilingual Features Representation

Lorenzo Albano, Domenico Beneventano, Sonia Bergamaschi

2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) > 2 > 343 - 349

2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT)

The use of word senses in place of surface word forms has been shown to improve performance on many computational tasks, including intelligent web search. In this paper we propose a novel approach to automatic discovery of word senses from raw text, a task referred to as Word Sense Induction (WSI). Almost all the WSI approaches proposed in the literature dealt with monolingual data and only very few...

chapter

Density clustering based on border-expanding

Dongming Chen, Yun Yan, Dongqi Wang

2014 10th International Conference on Natural Computation (ICNC) > 670 - 674

2014 10th International Conference on Natural Computation (ICNC)

DBSCAN is a clustering algorithm based on density. It can divide high-density regions into clusters, filter out noise effectively and discover clusters of arbitrary shape and size in a dataset. However, the DBSCAN algorithm needs to traverse the dataset to find core objects, which results in a large I/O cost when processing large-scale datasets. A fast algorithm (BEDBSCAN) is developed...

chapter

Big data clustering validity

Mania Tlili, Tarek M. Hamdani

2014 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR) > 348 - 352

2014 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR)

Nowadays we communicate in a digital universe. In fact the amount of data (structured and unstructured) is exploding. That's what we call Big Data. Such voluminous data are in most cases noisy and overlapping, so clustering them poses critical challenges. In addition, validating the resulting partitions is a serious problem. In this paper we present a new fuzzy validity index able to interpret the...

chapter

A clustering approach for detection of ground in micropulse photon-counting LiDAR altimeter data

Jiashu Zhang, John Kerekes, Beata Csatho, Toni Schenk, more

2014 IEEE Geoscience and Remote Sensing Symposium > 177 - 180

IGARSS 2014 - 2014 IEEE International Geoscience and Remote Sensing Symposium

Observations from satellite lidar instruments have provided evidence of remarkable changes in polar ice sheets on a global scale. The Ice, Cloud and land Elevation Satellite-2 (ICESat-2) is scheduled for launch by NASA in 2017 and will monitor the elevation changes of polar ice sheets and vegetation canopy. To validate ICESat-2's approach of photon-counting laser altimetry, measurements obtained...

chapter

A Clustering Approach to the Discovery of Points of Interest from Geo-Tagged Microblog Posts

Anders Skovsgaard, Darius Šidlauskas, Christian S. Jensen

2014 IEEE 15th International Conference on Mobile Data Management > 1 > 178 - 188

2014 15th IEEE International Conference on Mobile Data Management (MDM)

Points of interest (PoI) data serves an important role as a foundation for a wide variety of location-based services. Such data is typically obtained from an authoritative source or from users through crowd sourcing. It can be costly to maintain an up-to-date authoritative source, and data obtained from users can vary greatly in coverage and quality. We are also witnessing a proliferation of both...

chapter

Fault diagnosis of the continuous stirred tank heater using fuzzy-possibilistic c-means algorithm

Shen Yin, Jingxin Zhang

2014 IEEE 23rd International Symposium on Industrial Electronics (ISIE) > 2445 - 2450

2014 IEEE 23rd International Symposium on Industrial Electronics (ISIE)

This paper mainly introduces a practical algorithm called the fuzzy-possibilistic c-means (FPCM) clustering algorithm. It is based on the fuzzy c-means (FCM) clustering algorithm and the possibilistic c-means (PCM) clustering algorithm. The FPCM algorithm addresses the existing problems of the above two algorithms and produces both memberships and possibilities simultaneously. For example, FPCM algorithm works...

chapter

A new clustering algorithm with adaptive attractor for LIDAR points

Min Wei, Longyu Zhao, Xiaolong Liu

2014 IEEE International Conference on Progress in Informatics and Computing > 21 - 26

2014 International Conference on Progress in Informatics and Computing (PIC)

Clustering is a semi-supervised or unsupervised algorithm for classifying a set of data according to underlying characteristics or similarity. There are many different algorithms for different applications. Each algorithm has its advantages in particular fields. For the data obtained from an automotive LUX-LIDAR, the existing algorithms fail to cluster them accurately or efficiently. It...

chapter

DBSCAN: Past, present and future

Kamran Khan, Saif Ur Rehman, Kamran Aziz, Simon Fong, more

The Fifth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT 2014) > 232 - 238

2014 Fifth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT)

Data Mining is all about data analysis techniques. It is useful for extracting hidden and interesting patterns from large datasets. Clustering techniques are important when it comes to extracting knowledge from large amounts of spatial data collected from various applications, including GIS, satellite images, X-ray crystallography, remote sensing, and environmental assessment and planning. To extract...

chapter

Scalable clustering with adaptive instance sampling

JaeKyung Yang, ByoungJin Yu, MyoungJin Choi

2013 IEEE International Conference on Industrial Engineering and Engineering Management > 1309 - 1313

2013 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM)

The computation time of most clustering algorithms is affected by the number of attributes and instances. Thus, the data mining community has made efforts to make clustering induction efficient. Hence, scalability is naturally a critical issue that the data mining community faces. One method to handle this issue is to use a subset of all instances. This paper suggests an...

chapter

SNN Input Parameters: How Are They Related?

Guilherme Moreira, Maribel Yasmina Santos, Joao Moura-Pires

2013 International Conference on Parallel and Distributed Systems > 492 - 497

2013 International Conference on Parallel and Distributed Systems (ICPADS)

Nowadays, organizations are facing several challenges when they try to analyze generated data with the aim of extracting useful information. This analytical capacity needs to be enhanced with tools capable of dealing with big data sets without making the analytical process a difficult task. Clustering is usually used, as this technique does not require any prior knowledge about the data. However,...

chapter

4D+SNN: A Spatio-Temporal Density-Based Clustering Approach with 4D Similarity

Ricardo Oliveira, Maribel Yasmina Santos, Joao Moura Pires

2013 IEEE 13th International Conference on Data Mining Workshops > 1045 - 1052

2013 IEEE 13th International Conference on Data Mining Workshops (ICDMW)

Spatio-temporal clustering is a subfield of data mining that is gaining increasing scientific attention due to the advances of location-based or environmental devices that register position, time and, in some cases, other semantic attributes. This process aims to group objects based on their spatial and temporal similarity, helping to discover interesting patterns and correlations in large...

chapter

Dynamic Analytics for Spatial Data with an Incremental Clustering Approach

Fernando Mendes, Maribel Yasmina Santos, Joao Moura-Pires

2013 IEEE 13th International Conference on Data Mining Workshops > 552 - 559

2013 IEEE 13th International Conference on Data Mining Workshops (ICDMW)

Several clustering algorithms have been extensively used to analyze vast amounts of spatial data. One of these algorithms is SNN (Shared Nearest Neighbor), a density-based algorithm, which has several advantages when analyzing this type of data due to its ability to identify clusters of different shapes, sizes and densities, as well as its capability to deal with noise. Taking into account...

INFONA - science communication portal
