Search results

chapter

A novel structural mass based dissimilarity measure

Peng Fang, Liusheng Huang, Hongli Xu, Shaowei Wang

2016 5th International Conference on Computer Science and Network Technology (ICCSNT) > 304 - 308

Data-dependent dissimilarity provides a better closest adaptation than distance measures. When dealing with arbitrary types of data sets, especially those with manifold structures, mass-based dissimilarity [1] cannot perform well. Taking the structure into account, this paper introduces a generic structural mass-based dissimilarity which is easily applied to existing algorithms in different missions...

chapter

Clustering of microRNAs Using Rough Hypercuboid Based Fuzzy C-Means

Partha Garai, Pradipta Maji

2016 International Conference on Information Technology (ICIT) > 304 - 308

MicroRNAs form a family of single-stranded RNA molecules, approximately 22 nucleotides in length, that are present in all animals and plants. Various studies have revealed that microRNAs tend to cluster on chromosomes. In this regard, a novel clustering algorithm is presented in this paper, integrating the rough hypercuboid approach with fuzzy c-means. Using the concept of rough hypercuboid equivalence...

chapter

Intrusion detection system using data mining a review

Varsha Singh, Shubha Puthran

2016 International Conference on Global Trends in Signal Processing, Information Computing and Communication (ICGTSPICC) > 587 - 592

Every day, huge amounts of information are transferred from one network to another, and this information may be exposed to attacks. Information and information systems should be protected from unauthorized users. Providing and maintaining the confidentiality and integrity of information is a very tedious job, so intrusion detection plays a very important role. Although various methods are used to protect...

chapter

A distributed, scalable parallelization of fuzzy c-means algorithm

Reena Bharathi, S.C. Shirwaikar, Vilas Kharat

2016 IEEE Bombay Section Symposium (IBSS) > 1 - 7

Distributed applications from domains such as health care, e-commerce, science, and social networks tend to generate large volumes of heterogeneous data that grow exponentially over time, leading to big data sets. Descriptive analytics on big data sets poses a great challenge for traditional data analytical tools, since it is to be performed on the full data set, unlike predictive...

chapter

Accelerated ROCK algorithm

Avli Saxena, Manoj Singh

2016 International Conference on Recent Advances and Innovations in Engineering (ICRAIE) > 1 - 5

ROCK is a popular algorithm for clustering categorical data, owing to its ingenious concept of links between data points. The only issue with this method is its time complexity: the procedure is inherently slow, with a maximum of N-k iterations. This paper shows how properties of the dataset can be utilized to reduce the total number of iterations by a factor of 10 or more. The reduction becomes more significant as the size of the dataset grows...

chapter

Analysis of Complex Data in Telecommunications Industry

Nayana Gupta, Mohammed Wasid, Rashid Ali

2016 IEEE International Conference on Computer and Information Technology (CIT) > 104 - 107

In this paper, we report an application of data analytics in a real-world business case from the telecom industry. This work was carried out in partnership with an IT company in India, using a large data set of telecom customers. As part of data analytics, the first task was to perform cleansing of bad and missing data, transforming heterogeneous formats into a unified format, semantic analysis on the data (semantics...

chapter

Automated analysis of flow cytometry data: a systematic review of recent methods

Taher Ahmed Ghaleb, Mawal Ali Mohammed, Emad Ramadan

2016 2nd International Conference on Open Source Software Computing (OSSCOM) > 1 - 7

Flow cytometry (FCM) is a very well-known method that is broadly used in clinical and research laboratories. Both clinical and research laboratories have been the target domains of FCM applications. The key research question in this particular field is “how to effectively automate FCM data analysis?”. To answer this question, this paper systematically reviews current advances in the automation of...

chapter

Community Detection for Cold Start Problem in Personalization: Community Detection is Large Social Network Graphs Based on Users’ Structural Similarities and Their Attribute Similarities

Ankush Bhatia

2016 IEEE International Conference on Computer and Information Technology (CIT) > 167 - 171

A persona in a social network is defined as a person's activities and attributes in that network as seen by others. A community in a social network is defined as a group of users in that network who share common interests and are most likely to interact with each other. For community detection, a user's persona and its connections with the other users in a network,...

chapter

Text Document Clustering: The Application of Cluster Analysis to Textual Document

Venkata Srikanth Reddy, Patrick Kinnicutt, Roger Lee

2016 International Conference on Computational Science and Computational Intelligence (CSCI) > 1174 - 1179

Gathering the most relevant data for one's needs from the huge collection of data on the internet is very difficult. To make it easier, we propose an application called text clustering, which is the automatic grouping of text documents into clusters, so that documents within a cluster are similar to one another but not similar to documents in other clusters. Most of existing...

chapter

Minimum-volume-regularized weighted symmetric nonnegative matrix factorization for clustering

Tianxiang Gao, Sigurdur Olofsson, Songtao Lu

2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP) > 247 - 251

In recent years, nonnegative matrix factorization (NMF) has attracted much attention in the machine learning and signal processing fields due to its interpretability of data in a low-dimensional subspace. For clustering problems, symmetric nonnegative matrix factorization (SNMF), as an extension of NMF, factorizes the similarity matrix of data points directly and outperforms NMF when dealing with nonlinear data...

chapter

Clustering for point pattern data

Nhat-Quang Tran, Ba-Ngu Vo, Dinh Phung, Ba-Tuong Vo

2016 23rd International Conference on Pattern Recognition (ICPR) > 3174 - 3179

Clustering is one of the most common unsupervised learning tasks in machine learning and data mining. Clustering algorithms have been used in a plethora of applications across several scientific fields. However, there has been limited research in the clustering of point patterns - sets or multi-sets of unordered elements - that are found in numerous applications and data sources. In this paper, we...

chapter

Event segmentation using MapReduce based big data clustering

M. Omair Shafiq

2016 IEEE International Conference on Big Data (Big Data) > 1857 - 1866

Event segmentation is an important step in monitoring and management applications that categorizes different events into different segments. This is especially important when the applications to be monitored and managed are large-scale, comprehensive and data-intensive in nature. The process of segmentation is based on data clustering, which is one of the key data mining methods used these days. There...

chapter

A Theoretical Analysis of the Fuzzy K-Means Problem

Johannes Blomer, Sascha Brauer, Kathrin Bujna

2016 IEEE 16th International Conference on Data Mining (ICDM) > 805 - 810

One of the most popular fuzzy clustering techniques is the fuzzy K-means algorithm (also known as the fuzzy c-means or FCM algorithm). In contrast to the K-means and K-median problems, the underlying fuzzy K-means problem has not been studied from a theoretical point of view. In particular, there are no algorithms with approximation guarantees similar to the famous K-means++ algorithm known for the fuzzy...

chapter

Analysis of product Twitter data though opinion mining

Roshan Fernandes, Rio D'Souza

2016 IEEE Annual India Conference (INDICON) > 1 - 5

In recent years, there has been rapid growth in online communication. There are many social networking sites and related mobile applications, and more are still emerging. A huge amount of data is generated by these sites every day, and this data can be used as a source for various kinds of analysis. Twitter is one of the most popular networking sites, with millions of users. There are users with different...

chapter

Subspace Clustering Ensembles through Tensor Decomposition

Dominik Mautz, Christian Bohm, Claudia Plant

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 1225 - 1234

In recent years many different subspace clustering algorithms and related methods have been proposed. They promise to not only find hidden structures in data sets, but also to select for each structure the features which are most prominent. Yet, most of these methods suffer from the same problem: finding a satisfactory clustering result heavily depends on an adequate configuration of the parameters. In...

chapter

MoCham: Robust Hierarchical Clustering Based on Multi-objective Optimization

Tomas Barton, Tomas Bruna, Pavel Kordik

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 831 - 838

Many clustering evaluation methods are computed as a ratio between two objectives; typically these objectives express the compactness of all clusters while trying to maximize the separation between individual clusters. However, the clustering process itself is typically implemented as a single-objective problem: optimizing a linear combination of between-points closeness. We propose MoCham - a hierarchical...

chapter

The Study of K-Means Based on Hybrid SA-PSO Algorithm

Xingang Wang, Qi Sun

2016 9th International Symposium on Computational Intelligence and Design (ISCID) > 2 > 211 - 214

This paper first introduces the relevant principles of the K-Means algorithm, the simulated annealing (SA) algorithm and the particle swarm optimization (PSO) algorithm. Then, to address the influence of the initial value of the K-Means algorithm on its optimal solution, a hybrid K-Means algorithm based on SA-PSO is proposed. The new algorithm uses the advantage of jumping out of local...

chapter

Hybrid Clustering Based on a Graph Model

Hongjun Su, Hong Zhang

2016 9th International Symposium on Computational Intelligence and Design (ISCID) > 1 > 242 - 245

A hybrid clustering approach is proposed for processing image-like data such as plots in flow cytometry. Clustering or partitioning data into relatively homogeneous and coherent subpopulations can be an effective pre-processing method to achieve data analysis tasks such as pattern recognition and classification. Our method uses a graph to model the initial manual partition of the dataset. Based on...

chapter

Random Projection Clustering on Streaming Data

Lee A. Carraher, Philip A. Wilsey, Anindya Moitra, Sayantan Dey

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 708 - 715

Clustering streaming data has gained importance in recent years due to an expanding opportunity to discover knowledge in widely available data streams. As streams are potentially evolving and unbounded sequences of data objects, clustering algorithms capable of performing fast and incremental processing of data points are necessary. This paper presents a method of clustering high-dimensional data streams...

chapter

Using parallel hierarchical clustering to address spatial big data challenges

Alan Woodley, Ling-Xiang Tang, Shlomo Geva, Richi Nayak, et al.

2016 IEEE International Conference on Big Data (Big Data) > 2692 - 2698

Clustering can help to make large datasets more manageable by grouping together similar objects. However, most clustering approaches are unable to scale to very large datasets (e.g. more than 10 million objects). The K-Tree is a data structure and clustering algorithm that has proven to be scalable with large streaming datasets. Here, we apply the K-Tree to spatial data (satellite images) and extend...
