Search results

Items from 1 to 20 out of 128 results

chapter

Clustering of College Students Based on Improved K-Means Algorithm

Zhongxiang Fan, Yan Sun

2016 International Computer Symposium (ICS) > 676 - 679

2016 International Computer Symposium (ICS)

Many colleges have accumulated a large amount of information, such as achievement data and consumption records. According to the above information, we attempt to identify the student group from various aspects. Given this, we can acquire the characteristics of students in different groups. In this way, the college can have a better understanding of students to accomplish the reasonable management...

chapter

MLP-based undersampling technique for imbalanced learning

Varsha Babar, Roshani Ade

2016 International Conference on Automatic Control and Dynamic Optimization Techniques (ICACDOT) > 142 - 147

2016 International Conference on Automatic Control and Dynamic Optimization Techniques (ICACDOT)

The imbalanced learning problem is becoming pervasive in today's data mining applications. This problem refers to the uneven distribution of instances among the classes which poses difficulty in the classification of rare instances. Several undersampling as well as oversampling methods were proposed to deal with such imbalance. Many undersampling techniques do not consider distribution of information...

chapter

Iterative sparse matrix-vector multiplication on in-memory cluster computing accelerated by GPUs for big data

Jiwu Peng, Zheng Xiao, Cen Chen, Wangdong Yang

2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) > 1454 - 1460

2016 12th International Conference on Natural Computation and 13th Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)

Iterative SpMV (ISpMV) is a key operation in many graph-based data mining algorithms and machine learning algorithms. Along with the development of big data, the matrices can be so large, perhaps billion-scale, that the SpMV cannot be implemented on a single computer. Therefore, it is a challenging issue to implement and optimize SpMV for large-scale data sets. In this paper, we used an in-memory...

chapter

Empirical analysis and improvement of density based clustering algorithm in data streams

Madhu Shukla, Y. P. Kosta

2016 International Conference on Inventive Computation Technologies (ICICT) > 3 > 1 - 4

2016 International Conference on Inventive Computation Technologies (ICICT)

Data mining has gained much importance in the field of research these days. It makes a perfect blend for analyzing data of any field and provides decision-based output. Data generation and storage these days are done at high speed. Non-stationary systems play a holistic role in providing such data. Availability of such data creates scope of analysis for researchers. Such data which are continuous, unbounded,...

chapter

SaFe-NeC: A scalable and flexible system for network data characterization

Daniele Apiletti, Elena Baralis, Tania Cerquitelli, Paolo Garza, et al.

NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium > 812 - 816

NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium

Nowadays, large volumes of data and measurements are being continuously generated by computer and telecommunication networks, but such volumes make it difficult to extract meaningful knowledge from them. This paper presents SaFe-NeC, an innovative methodology for analyzing network traffic by exploiting data mining techniques, i.e. clustering and classification algorithms, focusing on self-learning...

chapter

MapReduce Model of Improved K-Means Clustering Algorithm Using Hadoop MapReduce

Nadeem Akthar, Mohd Vasim Ahamad, Shahbaaz Ahmad

2016 Second International Conference on Computational Intelligence & Communication Technology (CICT) > 192 - 198

2016 Second International Conference on Computational Intelligence & Communication Technology (CICT)

In today's digital world, digital data is coming in and going out faster than ever before. This data is of no use until we extract some useful content from it. However, it is impractical and inefficient to use traditional database management techniques on big data. That is why big data technologies like Hadoop came into existence. Hadoop is an open source framework, which can be used to process...

chapter

A comparative study of K-Means, DBSCAN and OPTICS

Hari Krishna Kanagala, V.V. Jaya Rama Krishnaiah

2016 International Conference on Computer Communication and Informatics (ICCCI) > 1 - 6

2016 International Conference on Computer Communication and Informatics

In view of the information available today, recent progress in data mining research has led to the development of various efficient methods for mining interesting patterns in large databases. It plays a vital role in the knowledge discovery process by analyzing huge amounts of data from various sources and summarizing it into useful information. It is helpful for analyzing the volumes of data in different domains...

chapter

A survey on online Stock forum using subspace clustering

G. Shyamala, N. Pooranam

2016 International Conference on Computer Communication and Informatics (ICCCI) > 1 - 6

2016 International Conference on Computer Communication and Informatics

Financial stock data analysis and future prediction in terms of sentiments is a great challenge in big data research. Among unlabelled opinions, opinion classification with unsupervised learning algorithms will lead to classification errors, as the data is sparse and high dimensional. To overcome this problem, the sentiment analysis to extract the opinion of each word in the stock data has been...

chapter

DDBSCAN: Different Densities-Based Spatial Clustering of Applications with Noise

Mohammad F. Hassanin, Mohamed Hassan, Abdalla Shoeb

2015 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT) > 401 - 404

2015 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT)

Recent advances in using computers in different fields of science have produced huge amounts of data. These data serve as an analysis tool and a key to overcoming many problems. Clustering is a primary process for analyzing the data, as well as a preprocessing step before other techniques like classification. Density-based clustering algorithms have advantages like clustering any arbitrary shapes and...

chapter

A hybrid outlier detection algorithm based on partitioning clustering and density measures

Hamada Rizk, Sherin Elgokhy, Amany Sarhan

2015 Tenth International Conference on Computer Engineering & Systems (ICCES) > 175 - 181

2015 Tenth International Conference on Computer Engineering & Systems (ICCES)

Outlier detection is an important issue in the realm of data mining. Several applications rely on outlier detection, such as intrusion detection, fraud detection, medical and public health data, image processing, etc. Clustering-based outlier detection algorithms are considered the most important outlier detection approaches. They provide a high detection rate; however, they suffer from high false...

chapter

Clustering on Big Data Using Hadoop MapReduce

Nadeem Akthar, Mohd Vasim Ahamad, Shahbaz Khan

2015 International Conference on Computational Intelligence and Communication Networks (CICN) > 789 - 795

2015 International Conference on Computational Intelligence and Communication Networks (CICN)

With the phenomenal increase in digital data, it is inefficient to run the traditional clustering algorithms on separate servers. To deal with this problem, researchers are migrating to distributed environments to implement the traditional clustering algorithms, more specifically K-means clustering. In traditional K-means clustering, the problem of instability caused by the random initial centers exists...

chapter

Application of clustering algorithm on TV programmes preference grouping of subscribers

Haiyue Zhang, Jianping Chai, Yan Wang, Min An, et al.

2015 IEEE International Conference on Computer and Communications (ICCC) > 40 - 44

2015 IEEE International Conference on Computer and Communications (ICCC)

With the development of digital cable interactive business and the diversification of the customers' demand, grouping TV programmes based on preferences of users effectively is vital for market segmentation and differentiation. The study summarizes the main principle and characteristic of clustering algorithm, and uses K-Means algorithm to show TV programmes preference grouping based on 52392 subscribers...

chapter

Density K-means: A new algorithm for centers initialization for K-means

Xv Lan, Qian Li, Yi Zheng

2015 6th IEEE International Conference on Software Engineering and Service Science (ICSESS) > 958 - 961

2015 6th IEEE International Conference on Software Engineering and Service Science (ICSESS)

K-means is one of the most significant clustering algorithms in data mining. It performs well in many cases, especially in the massive data sets. However, the result of clustering by K-means largely depends upon the initial centers, which makes K-means difficult to reach global optimum. In this paper, we developed a novel algorithm based on finding density peaks to optimize the initial centers for...

chapter

Click based inferring of user search goals using pseudo document

Harshada P. Bhambure, Mandar Mokashi

2015 International Conference on Computer, Communication and Control (IC4) > 1 - 7

2015 International Conference on Computer, Communication and Control (IC4)

The user enters any query to find desired information. To discover number of user search goals and representing each goal with some keyword, we first infer user search goals for a query by clustering feedback sessions. For that, we use a concept of pseudo document, which is the revised version of feedback session. Then the user search goals are determined by clustering the pseudo documents and it...

chapter

Analysis and evaluation of outlier detection algorithms in data streams

Madhu Shukla, Y. P. Kosta, Prashant Chauhan

2015 International Conference on Computer, Communication and Control (IC4) > 1 - 8

2015 International Conference on Computer, Communication and Control (IC4)

Data mining is one of the most exciting fields of research. As data is digitized and systems are connected and integrated, the scope of data generation and analytics has increased exponentially. Today, most systems generate non-stationary data that is huge in size and volume, arrives at high speed, and changes quickly; such data are called data streams. One of the most...

chapter

Cluster analysis based on opinion mining

Xiaoye Wang, Xiaorui Chai, Ching-Hsien Hsu, Yingyuan Xiao, et al.

2015 8th International Conference on Ubi-Media Computing (UMEDIA) > 110 - 115

2015 8th International Conference on Ubi-Media Computing (UMEDIA)

Opinion mining can extract useful information from users' comments. After clustering and analyzing the information, users can get a detailed understanding of the commodity and then decide whether or not to buy it. In this paper, we first extract evaluation objects and evaluation words, then cluster the evaluation objects. Next, based on the SO-PMI algorithm, we judge the polarity of evaluation...

chapter

k-Means Performance Improvements with Centroid Calculation Heuristics Both for Serial and Parallel Environments

Jeyhun Karimov, Murat Ozbayoglu, Erdogan Dogdu

2015 IEEE International Congress on Big Data > 444 - 451

2015 IEEE International Congress on Big Data (BigData Congress)

K-means is the most widely used clustering algorithm due to its fairly straightforward implementations in various problems. Meanwhile, when the number of clusters increases, the number of iterations also tends to slightly increase. However, there are still opportunities for improvement, as some studies in the literature indicate. In this study, improved implementations of k-means algorithm with a centroid...

chapter

Comparative study of cluster validity techniques using K-mediod algorithm

Romana Riyaz, Mohd Arif Wani

2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom) > 893 - 898

2015 2nd International Conference on "Computing for Sustainable Global Development" (INDIACom)

The most important task of the clustering process is the validation of results obtained from clustering algorithms. There are many cluster validation criteria, but the most commonly used approaches are founded on internal validity indices. Numerous indices have been suggested from time to time, but only some of them have been popularly used. In this paper we have drawn a...

chapter

A Modified Fuzzy C Means Clustering Using Neutrosophic Logic

Nadeem Akhtar, Mohd Vasim Ahmad

2015 Fifth International Conference on Communication Systems and Network Technologies > 1124 - 1128

2015 Fifth International Conference on Communication Systems and Network Technologies (CSNT)

A cluster can be defined as a collection of data objects that are similar to each other and grouped together, whereas data objects that are different are grouped into different groups. The process of grouping a set of objects into classes of similar objects is called clustering. In fuzzy c-means clustering, every data point belongs to every cluster by some membership value. Hence, every cluster...

chapter

Cluster Analysis and Artificial Neural Networks: A Case Study in Credit Card Fraud Detection

Emanuel Mineda Carneiro, Luiz Alberto Vieira Dias, Adilson Marques da Cunha, Lineu Fernando Stege Mialaret

2015 12th International Conference on Information Technology - New Generations > 122 - 126

2015 12th International Conference on Information Technology - New Generations (ITNG)

Data normalization for use in Artificial Neural Networks often requires extensive statistical analysis. This paper presents an initial investigation of a case study involving credit card fraud detection, where Cluster Analysis was applied to data normalization. Early results obtained from the use of Artificial Neural Networks and Cluster Analysis on fraud detection have shown that neuronal inputs can...

Data set:
ieee
Keywords:
CLUSTERING ALGORITHMS
COMPUTERS
DATA MINING
Publication type:
book

Content availability

Available (127)
None (1)

Keywords

ALGORITHM DESIGN AND ANALYSIS (69)
PATTERN CLUSTERING (36)
CLASSIFICATION ALGORITHMS (28)
COMPUTATIONAL MODELING (28)
FEATURE EXTRACTION (26)
EDUCATIONAL INSTITUTIONS (25)
SIGNAL PROCESSING (25)
ACCURACY (23)
DATA MODELS (23)
CLUSTERING (22)
SIGNAL PROCESSING ALGORITHMS (22)
TRAINING (19)
NOISE (18)
PATTERN RECOGNITION (17)
DATABASES (16)
ARTIFICIAL NEURAL NETWORKS (15)
CONFERENCES (15)
PARTITIONING ALGORITHMS (15)
ROBUSTNESS (15)
INTERNET (14)
SHAPE (14)
CLUSTERING METHODS (13)
COMPLEXITY THEORY (13)
HEURISTIC ALGORITHMS (13)
PRESSES (13)
IMAGE PROCESSING (12)
TRANSFORMS (12)
GEOMETRY (11)
IMAGE SEGMENTATION (11)
COMPUTER SCIENCE (10)
IMAGE COLOR ANALYSIS (10)
SOFTWARE (10)
ANALYTICAL MODELS (9)
COMPUTER VISION (9)
DATA ANALYSIS (9)
FILTERING ALGORITHMS (9)
K-MEANS (9)
MACHINE LEARNING (9)
REAL TIME SYSTEMS (9)
TRAINING DATA (9)
APPROXIMATION METHODS (8)
ARTIFICIAL INTELLIGENCE (8)
ESTIMATION (8)
IMAGE RECONSTRUCTION (8)
MONITORING (8)
PARALLEL PROCESSING (8)
TESTING (8)
APPROXIMATION ALGORITHMS (7)
ASSOCIATION RULES (7)
CONVERGENCE (7)
ENTROPY (7)
FUZZY SET THEORY (7)
IMAGE ANALYSIS (7)
IMAGE EDGE DETECTION (7)
IMAGE RECOGNITION (7)
INFORMATION TECHNOLOGY (7)
INTRUSION DETECTION (7)
LIGHTING (7)
MATHEMATICAL MODEL (7)
MERGING (7)
METEOROLOGY (7)
OBJECT RECOGNITION (7)
SECURITY (7)
SECURITY OF DATA (7)
SUPPORT VECTOR MACHINE CLASSIFICATION (7)
SUPPORT VECTOR MACHINES (7)
ADAPTATION MODEL (6)
COMPUTER GRAPHICS (6)
EDUCATION (6)
ELECTRONIC MAIL (6)
EQUATIONS (6)
GAIN (6)
GRAPHICS (6)
IMAGE RESOLUTION (6)
MACHINE LEARNING ALGORITHMS (6)
ORGANIZATIONS (6)
PATTERN ANALYSIS (6)
PATTERN CLASSIFICATION (6)
SEARCH PROBLEMS (6)
SENSORS (6)
SOFTWARE ALGORITHMS (6)
SOLID MODELING (6)
SPATIAL DATABASES (6)
STATISTICAL ANALYSIS (6)
SURFACE TREATMENT (6)
TEXT ANALYSIS (6)
USA COUNCILS (6)
ANOMALY DETECTION (5)
ARRAYS (5)
BANDWIDTH (5)
BIOINFORMATICS (5)
BUILDINGS (5)
CLOUD COMPUTING (5)
COMPUTATIONAL EFFICIENCY (5)
COMPUTER ARCHITECTURE (5)
CONVOLUTION (5)
COVARIANCE MATRIX (5)

INFONA - science communication portal
