Search results

chapter

Efficient genetic K-Means clustering for health care knowledge discovery

Ahmed Alsayat, Hoda El-Sayed

2016 IEEE 14th International Conference on Software Engineering Research, Management and Applications (SERA) > 45 - 52

Data mining and machine learning are becoming some of the most interesting research areas and are increasingly popular in health organizations. Hidden patterns in patient data can be extracted by applying data mining. Data mining techniques and tools are very helpful because they give health care professionals significant knowledge to support a decision. Researchers have shown several utilities...

chapter

A bi-directional sampling based on K-means method for imbalance text classification

Jia Song, Xianglin Huang, Sijun Qin, Qing Song

2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS) > 1 - 5

This paper studies the imbalanced data classification problem and proposes bi-directional sampling based on clustering (BDSK) to address it. The algorithm combines the SMOTE over-sampling algorithm with an under-sampling algorithm based on K-Means to solve both the within-class and the between-class imbalance problems. It not only avoids introducing too much noise but also...

chapter

Big data and clustering algorithms

V W Ajin, Lekshmy D Kumar

2016 International Conference on Research Advances in Integrated Navigation Systems (RAINS) > 1 - 5

Data mining is a method for extracting useful information from data, but classical data mining approaches cannot be applied directly to big data because of their sheer complexity. The data produced by numerous scientific applications and integrated environments has grown rapidly in recent years, not only in size but also in variety. The data collected is of very...

chapter

Classification of movement patterns of soccer referees using K-means

Cengiz Kurkcu, Umit Deniz Ulusar

2016 24th Signal Processing and Communication Application Conference (SIU) > 137 - 140

Many sports are followed by large crowds, and soccer is the most popular among them. During the game, the referee is responsible for protecting players' health and ensuring proper implementation of the rules. To accomplish these tasks, the referee needs tremendous physical and mental fitness, has to be able to interpret events according to the spirit of the rules and needs to...

chapter

Vritthi - a theoretical framework for IT recruitment based on machine learning techniques applied over Twitter, LinkedIn, SPOJ and GitHub profiles

Animesh Giri, Abhiram Ravikumar, Sneha Mote, Rahul Bharadwaj

2016 International Conference on Data Mining and Advanced Computing (SAPIENCE) > 1 - 7

In this model, we propose an innovative recruitment system that uses social networking websites such as Twitter and LinkedIn, along with the code repository hosting website GitHub and competitive coding platforms such as SPOJ. It aims to develop advanced search engines that automatically sort job-seekers against job offer requirements using various data mining and machine learning techniques. Vritthi allows...

chapter

A proposed framework using CAC algorithm to predict systemic lupus erythematosus (SLE)

S. Gomathi, V. Narayani

2016 World Conference on Futuristic Trends in Research and Innovation for Social Welfare (Startup Conclave) > 1 - 6

The paper proposes a new framework to predict the chronic disease lupus. A new algorithm is proposed that is well suited to supervised, semi-supervised and unsupervised data. The algorithm is named CAC (Clustering, Association and Classification). The best algorithms are selected based on accuracy. The 8 major attributes used to diagnose lupus have been identified and considered for prediction...

chapter

Recognition and anticipation of cancer and non-cancer prophecy using data mining approach

R. Kaviarasi, A. Valarmathi

2016 International Conference on Emerging Trends in Engineering, Technology and Science (ICETETS) > 1 - 4

Lung cancer is the number one cause of cancer deaths in both men and women worldwide. The two types of lung cancer, which grow and spread differently, are small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC). Treatment of lung cancer can involve a combination of surgery, chemotherapy, and radiation therapy as well as newer experimental methods. The general prognosis of...

chapter

Sentiment on social interactions using linear and non-linear clustering

S. Surya Kumari, G. Anjan Babu

2016 2nd International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB) > 177 - 181

Social media analytics plays a major role in e-commerce by extracting useful information about a product or service. Opinion mining has become the key process of social media analytics. Twitter is a large online social platform where millions of people share their opinions. In this paper two clustering techniques, k-means and DBSCAN, are applied to an annotated Twitter dataset in order to...

chapter

A comparative study of K-Means, DBSCAN and OPTICS

Hari Krishna Kanagala, V.V. Jaya Rama Krishnaiah

2016 International Conference on Computer Communication and Informatics (ICCCI) > 1 - 6

Given the amount of information available today, recent progress in data mining research has led to the development of various efficient methods for mining interesting patterns in large databases. Data mining plays a vital role in the knowledge discovery process by analyzing huge volumes of data from various sources and summarizing them into useful information. It is helpful for analyzing the volumes of data in different domains...

chapter

Clustering of imbalanced moodle data for early alert of student failure

Sabina Sisovic, Maja Matetic, Marija Brkic Bakaric

2016 IEEE 14th International Symposium on Applied Machine Intelligence and Informatics (SAMI) > 165 - 170

This paper is an attempt to apply EDM methods to Moodle data in order to detect specific behaviours within groups of students who tend to fail the course. The research is conducted on Moodle logs gathered in the blended course Programming 1. Extracting and using crucial information in time can be a turning point for at-risk students, which is what we tried to achieve in this research.

chapter

A classification method to classify high dimensional data

Amit Gupta, Naganna Chetty, Shraddha Shukla

2015 International Conference on Computing, Communication and Security (ICCCS) > 1 - 6

Rapid computerization and advances in technology have led to huge amounts of data in databases. Research has shown that the amount of data in the world doubles every 20 months. However, this data contains a large number of noisy values and thus cannot be used directly. Extracting information from this vast pool of data has emerged as a major challenge.

chapter

A hybrid outlier detection algorithm based on partitioning clustering and density measures

Hamada Rizk, Sherin Elgokhy, Amany Sarhan

2015 Tenth International Conference on Computer Engineering & Systems (ICCES) > 175 - 181

Outlier detection is an important issue in the realm of data mining. Several applications rely on outlier detection, such as intrusion detection, fraud detection, medical and public health data, image processing, etc. Clustering-based outlier detection algorithms are considered among the most important outlier detection approaches. They provide a high detection rate; however, they suffer from high false...

chapter

A novel algorithm DBCAPSIC for clustering non-numeric data

Jinkun Geng, Daren Ye, Ping Luo

2015 IEEE Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) > 295 - 304

Data mining techniques play an important role in the analysis of mass network information and big data. Cluster analysis, one of the main methods in data mining, draws great interest from researchers in various fields, who have proposed many algorithms such as the k-means algorithm and its variants and the density-based algorithm and its variants. However, these algorithms all have their own problems...

chapter

Data mining model for early fruit diseases detection

Milos Ilic, Petar Spalevic, Mladen Veinovic, Abdolkarim Abdala M. Ennaas

2015 23rd Telecommunications Forum Telfor (TELFOR) > 910 - 913

Automatic methods for early detection of plant diseases could be vital for precise fruit protection. Traditionally, the agriculture expert's knowledge is descriptive and experiment-based, so it is difficult to describe mathematically and then build a decision system that can replace it. Key parameters of a decision-based fruit protection system could differ for classes of plants and...

chapter

Principal Component Analysis and Clustering Based Indoor Localization

Dong Liang, Jingkang Yang, Rui Xuan, Zhaojing Zhang, more

2015 IEEE International Conference on Data Mining Workshop (ICDMW) > 1103 - 1108

This paper proposes an improved method that applies the principal component analysis (PCA) algorithm to an existing fingerprinting localization method based on iterative K-means, grid scoring (KS) and AP scoring (AS). In the off-line phase, the suggested method first evaluates the localization capability of every access point (AP), and then generates only a few new principal components...

chapter

Mining Massive Vector Data on Single Instruction Multiple Data Microarchitectures

Christian Bohm, Claudia Plant

2015 IEEE International Conference on Data Mining Workshop (ICDMW) > 597 - 606

Current microarchitectures are equipped with SIMD instruction sets enabling massive data parallelism within each core. Instruction sets like AVX or SSE operate on large reserved registers and support a wide range of parallel arithmetic or logical operations enabling up to 16 double precision floating point operations per clock cycle. Current data mining applications are usually far from fully exploiting...

chapter

Soil data clustering by using K-means and fuzzy K-means algorithm

Elma Hot, Vesna Popovic-Bugarin

2015 23rd Telecommunications Forum Telfor (TELFOR) > 890 - 893

The paper analyses the problem of soil clustering and spatial representation of the obtained results, based on in-situ measurements of the physical and chemical characteristics of soil. K-means and fuzzy K-means algorithms are adapted for soil data clustering. A database of soil samples collected in Montenegro is used for a comparative analysis of the algorithms used. Classified soil data are presented...

chapter

Mining the relation between dorm arrangement and student performance

Man Li, Ruisheng Shi

2015 IEEE International Conference on Big Data (Big Data) > 2344 - 2347

This paper discusses the relation between dorm arrangement and student performance. One of the unsupervised learning algorithms, k-means algorithm, is mainly used in the process of analysis. Students are clustered into several clusters according to their similarity of performance scores. This paper analyzes the result of clustering by comparing it with actual dorm arrangement. In the end, drawbacks...

chapter

Accelerating Medoids-based clustering with the Intel Many Integrated Core architecture

Timofey Rechkalov, Mikhail Zymbler

2015 9th International Conference on Application of Information and Communication Technologies (AICT) > 413 - 417

Partitioning Around Medoids (PAM) is a variation of the well-known k-Means clustering algorithm in which the center of each cluster must be chosen from among the objects being clustered. PAM is used in a wide spectrum of applications, e.g. text analysis, bioinformatics, intelligent transportation systems, etc. There are approaches to speed up the k-Means and PAM algorithms by means of graphics accelerators but...

chapter

A methodology for classifying visitors to an amusement park

Gustavo Dejean

2015 IEEE Conference on Visual Analytics Science and Technology (VAST) > 151 - 152

The main contribution of this work is showing how to obtain a classification of visitors to an amusement park by using cluster analysis and visualization techniques. The selection of variables for K-means algorithm and the results obtained are visually analyzed in dispersion graphs according to their Principal Components, in boxplots and in a Linear Model so as to fine-tune a result that can explain...

INFONA - science communication portal
