Search results

chapter

Accelerated Hierarchical Density Based Clustering

Leland McInnes, John Healy

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 33 - 42

We present an accelerated algorithm for hierarchical density-based clustering. Our new algorithm improves upon HDBSCAN*, which itself provided a significant qualitative improvement over the popular DBSCAN algorithm. The accelerated HDBSCAN* algorithm provides comparable performance to DBSCAN, while supporting variable-density clusters and eliminating the need for the difficult-to-tune distance scale...

chapter

A missing data imputation approach using clustering and maximum likelihood estimation

Muammer Albayrak, Kemal Turhan, Burcin Kurt

2017 Medical Technologies National Congress (TIPTEKNO) > 1 - 4

Missing data is a data mining problem, frequently encountered in healthcare data for a variety of reasons, that adversely affects data analysis and decision-making processes. It remains an important research topic because the success of an imputation method is influenced by many factors, such as the characteristics of the data and the type of missing data. In this study, a clustering and...

chapter

Classification of cognitive state using clustering based maximum margin feature selection framework

J. Siva Ramakrishna, Hariharan Ramasangu

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1092 - 1096

Over the past few years, the dimensionality of functional MRI (fMRI) has affected the analysis of brain data. In the field of machine learning and statistical analysis, classification of objects plays a significant role. Machine learning classifiers are used to discover the class of new data points from a set of data points. The application of learning techniques on fMRI data alleviates to cognitive state...

chapter

Mining Social Media Data Using Topological Data Analysis

Khaled Almgren, Minkyu Kim, Jeongkyu Lee

2017 IEEE International Conference on Information Reuse and Integration (IRI) > 144 - 153

Topological data analysis is a novel method to analyze high-dimensional qualitative data using a set of properties from topology. In this paper, we explore the feasibility of topological data analysis for mining social media data by investigating the problem of image popularity. We randomly crawl images from Instagram, convert their captions to 300-dimensional numerical vectors using Word2vec, calculate...

chapter

The initial analysis of failures emerging in production process for further data mining analysis

M. Nemeth, G. Michalconok

2017 21st International Conference on Process Control (PC) > 210 - 215

The aim of this paper is to examine possibilities for the initial analysis of failure data from an industrial production process. To perform the initial analysis of the production process data, we used a graphical statistical method as well as data mining methods such as drill-down analysis and cluster analysis. Before applying the mentioned techniques and methods it was necessary to know...

chapter

Data driven multi-agent m-health system to characterize the daily activities of elderly people

Solange Mendes, Jonas Queiroz, Paulo Leitao

2017 12th Iberian Conference on Information Systems and Technologies (CISTI) > 1 - 6

With the continuous growth of the aging population, society is facing new challenges, namely the implementation of healthcare services for older people, as well as the promotion of active aging and well-being. These challenges imply the optimization of these services through biomedical, physical, psychological and socio-environmental interventions. ICT can support the implementation...

article

Defining Data Clusters

Ali Jadbabaie

Computer > 2016 > 49 > 12 > 15

This installment of Computer's series highlighting the work published in IEEE Computer Society journals comes from IEEE Transactions on Network Science and Engineering.

chapter

A distributed, scalable parallelization of fuzzy c-means algorithm

Reena Bharathi, S.C. Shirwaikar, Vilas Kharat

2016 IEEE Bombay Section Symposium (IBSS) > 1 - 7

Distributed applications from different domains such as health care, e-commerce, science, and social networks tend to generate large volumes of heterogeneous data that grow exponentially over time, leading to big data sets. Descriptive analytics on big data sets poses a great challenge for traditional data analytical tools, since it must be performed on the full data set, unlike predictive...

chapter

Analysis of Complex Data in Telecommunications Industry

Nayana Gupta, Mohammed Wasid, Rashid Ali

2016 IEEE International Conference on Computer and Information Technology (CIT) > 104 - 107

In this paper, we report an application of data analytics in a real-world business case from the telecom industry. This work was carried out in partnership with an IT company in India holding a large data set of telecom customers. As part of data analytics, the first task was to perform cleansing of bad and missing data, transforming heterogeneous formats into a unified format, semantic analysis on the data (semantics...

chapter

Automated analysis of flow cytometry data: a systematic review of recent methods

Taher Ahmed Ghaleb, Mawal Ali Mohammed, Emad Ramadan

2016 2nd International Conference on Open Source Software Computing (OSSCOM) > 1 - 7

Flow cytometry (FCM) is a well-known method that is broadly used in clinical and research laboratories, both of which have been target domains of FCM applications. The key research question in this field is “how to effectively automate FCM data analysis?” To answer this question, this paper systematically reviews current advances in the automation of...

chapter

Robust Local Scaling Using Conditional Quantiles of Graph Similarities

Jayaraman J. Thiagarajan, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy, Bhavya Kailkhura

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 762 - 769

Spectral analysis of neighborhood graphs is one of the most widely used techniques for exploratory data analysis, with applications ranging from machine learning to social sciences. In such applications, it is typical to first encode relationships between the data samples using an appropriate similarity function. Popular neighborhood construction techniques such as k-nearest neighbor (k-NN) graphs...

chapter

The Study of K-Means Based on Hybrid SA-PSO Algorithm

Xingang Wang, Qi Sun

2016 9th International Symposium on Computational Intelligence and Design (ISCID) > 2 > 211 - 214

This paper first introduces the relevant principles of the K-Means algorithm, the simulated annealing (SA) algorithm and the particle swarm optimization (PSO) algorithm. Then, to address the influence of the initial value of the K-Means algorithm on its optimal solution, a hybrid K-Means algorithm based on SA-PSO is proposed. The new algorithm uses the advantage of jumping out of local...

chapter

Hierarchical Aggregation Approach for Distributed Clustering of Spatial Datasets

Malika Bendechache, Nhien-An Le-Khac, M-Tahar Kechadi

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) > 1098 - 1103

In this paper, we present a new approach to distributed clustering for spatial datasets, based on an innovative and efficient aggregation technique. This distributed approach consists of two phases: 1) a local clustering phase, where each node performs a clustering on its local data, and 2) an aggregation phase, where the local clusters are aggregated to produce global clusters. This approach is characterised...

chapter

Triclustering: An evolution of clustering

N. Narmadha, R. Rathipriya

2016 Online International Conference on Green Engineering and Technologies (IC-GET) > 1 - 4

This paper deals with a study of data mining techniques such as clustering, biclustering and triclustering. A large number of clustering approaches have been proposed for the analysis of gene expression. However, the results of applying standard clustering methods are limited. For this reason, concurrent clustering such as biclustering to find sub-matrices that are a subset of rows and a subset...

article

Group Component Analysis for Multiblock Data: Common and Individual Feature Extraction

Guoxu Zhou, Andrzej Cichocki, Yu Zhang, Danilo P. Mandic

IEEE Transactions on Neural Networks and Learning Systems > 2016 > 27 > 11 > 2426 - 2439

Real-world data are often acquired as a collection of matrices rather than as a single matrix. Such multiblock data are naturally linked and typically share some common features while at the same time exhibiting their own individual features, reflecting the underlying data generation mechanisms. To exploit the linked nature of data, we propose a new framework for common and individual feature extraction...

chapter

Time-series data analysis - a few case studies with bio-signals

Goutam Chakraborty

2016 2nd International Conference on Science in Information Technology (ICSITech) > 1

A dynamic system is represented by its outputs, as time-series data. Modeling of the time series is an important data mining task for predicting future behavior or detecting deviation from normal behavior (anomaly). Clustering of multiple time series leads to an understanding of the system and improves the efficiency of monitoring.

chapter

Characterization of Football Supporters from Twitter Conversations

Diogo F. Pacheco, Diego Pinheiro, Fernando B. de Lima-Neto, Eraldo Ribeiro, et al.

2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI) > 169 - 176

Football (aka soccer) is the most popular sport in the world. The popularity of the sport leads to several stories (some perhaps anecdotal) about supporters' behaviors and to the emergence of rivalries such as the famous Barcelona-Real Madrid rivalry in Spain. Little, however, has been done to characterize or profile online users' behaviors as football supporters and use them as an aggregate measure to club characterization...

chapter

Efficient Large Scale Clustering Based on Data Partitioning

Malika Bendechache, M-Tahar Kechadi, Nhien-An Le-Khac

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) > 612 - 621

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensional data, heterogeneity, and the high complexity of some algorithms. For instance, some algorithms may have linear complexity but require domain knowledge in order to determine their input...

chapter

A syllabus on data mining and machine learning with applications to cybersecurity

Anna Epishkina, Sergey Zapechnikov

2016 Third International Conference on Digital Information Processing, Data Mining, and Wireless Communications (DIPDMWC) > 194 - 199

Big data analytics is very fruitful for solving problems in cybersecurity. We have analyzed modern trends in intelligent security systems research and practice and worked out a syllabus for a new university course in the area of data mining and machine learning with applications to cybersecurity. The course is for undergraduate and graduate students studying cybersecurity. The main objective...

chapter

Knowing your enemies: leveraging data analysis to expose phishing patterns against a major US financial institution

Javier Vargas, Alejandro Correa Bahnsen, Sergio Villegas, Daniel Ingevaldson

2016 APWG Symposium on Electronic Crime Research (eCrime) > 1 - 10

Phishing attacks against financial institutions constitute a major concern and force them to invest thousands of dollars annually in the prevention, detection and takedown of these kinds of attacks. This operation is so massive and time-critical that there is usually no time to perform analysis to look for patterns and correlations between attacks. In this work we summarize our findings after applying...

INFONA - science communication portal
