The aspects of big data and its technologies are multiplying due to new methods of collecting data and diverse needs. Meteorological data is also a source of big data in terms of volume, variety, veracity and velocity, and it includes structured, unstructured and hybrid forms. This paper aims to apply the Hadoop architecture and the MapReduce algorithm to meteorological big data. It also describes...
With the phenomenal increase in digital data, it is inefficient to run traditional clustering algorithms on separate servers. To deal with this problem, researchers are migrating to distributed environments to implement traditional clustering algorithms, more specifically K-means clustering. In traditional K-means clustering, the problem of instability caused by the random initial centers exists...
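One round of K-means maps naturally onto the map/reduce pattern the abstract refers to: the map phase assigns each point to its nearest center, and the reduce phase averages each group into a new center. A minimal single-round sketch in plain Python (names and 2-D test data are ours, simulating the two phases rather than running on a cluster):

```python
# Minimal sketch of one MapReduce round of K-means.
# "Map": emit (nearest-center index, point); "reduce": average each group.
from collections import defaultdict
import math

def nearest(point, centers):
    # Index of the center closest to this point (Euclidean distance).
    return min(range(len(centers)),
               key=lambda i: math.dist(point, centers[i]))

def kmeans_round(points, centers):
    groups = defaultdict(list)
    for p in points:                  # map phase: assign each point
        groups[nearest(p, centers)].append(p)
    new_centers = list(centers)
    for i, pts in groups.items():     # reduce phase: average each group
        new_centers[i] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return new_centers

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers = [(0, 0), (10, 10)]
print(kmeans_round(points, centers))  # → [(0.0, 0.5), (10.0, 10.5)]
```

Iterating this round until the centers stop moving gives the full algorithm; the instability the abstract mentions comes from how the initial `centers` are chosen.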
It was planned that the entire surface of the Slovak Republic would be scanned, and there arose a need to store the resulting data and make it publicly available. For this purpose, a scalable file-based database system for storing and accessing a large amount of geographic point cloud data was developed. The principle of the system was tested and proved to be sufficient in most situations,...
K-means is the most widely used clustering algorithm due to its fairly straightforward implementation in various problems. Meanwhile, as the number of clusters increases, the number of iterations also tends to increase slightly. However, there are still opportunities for improvement, as some studies in the literature indicate. In this study, improved implementations of the k-means algorithm with a centroid...
Rough set theory has proven to be a successful computational intelligence tool. Rough entropy is a basic concept in rough set theory, usually used to measure the roughness of an information set. Existing algorithms can only deal with small data sets. Therefore, this paper proposes a method for parallel computation of entropy using MapReduce, a hot topic in big data mining. Moreover, corresponding...
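The entropy computation being parallelized reduces to counting value frequencies, which is exactly the shape of a MapReduce job: mappers emit `(value, 1)` pairs, reducers sum the counts, and the partial counts combine into H = -Σ p·log₂p. A plain-Python stand-in for that job (the function name and test data are illustrative, not from the paper):

```python
# Sketch: entropy of an attribute via map/reduce-style counting.
# "Map": emit (value, 1); "reduce": sum counts per value; then combine.
from collections import Counter
import math

def entropy(values):
    counts = Counter(values)          # simulates the shuffle + reduce
    n = len(values)
    # H = -sum over values of (p * log2 p)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

print(entropy(["a", "a", "b", "b"]))  # → 1.0
```

Because per-value counts are associative sums, this computation shards cleanly across mappers, which is what makes a MapReduce formulation viable for data sets too large for a single machine.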
Hadoop is a very efficient distributed processing framework. It is based on the map-reduce approach, in which an application is divided into small fragments of work, each of which may be executed on any node in the cluster. Hadoop is a very efficient tool for storing and processing unstructured, semi-structured and structured data. Unstructured data usually refers to data stored in files rather than in traditional...
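The "small fragments of work" idea can be illustrated with the canonical word-count example in the map-reduce style Hadoop popularized. This is a plain-Python simulation (the `mapper`/`reducer` names are ours, not the Hadoop API), with the sort standing in for the shuffle phase between the two stages:

```python
# Illustrative word count in map-reduce style (plain Python simulation).
from itertools import groupby

def mapper(line):
    # Map: each input line yields (word, 1) pairs independently.
    for word in line.split():
        yield word, 1

def reducer(word, counts):
    # Reduce: sum all counts emitted for one word.
    return word, sum(counts)

lines = ["big data", "big cluster"]
pairs = sorted(kv for line in lines for kv in mapper(line))  # "shuffle/sort"
result = dict(reducer(w, [c for _, c in grp])
              for w, grp in groupby(pairs, key=lambda kv: kv[0]))
print(result)  # → {'big': 2, 'cluster': 1, 'data': 1}
```

Each `mapper(line)` call touches only its own line, which is why the fragments can run on any node; only the shuffle requires coordination.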
This article first introduces the core architecture and operational mechanism of cloud computing and the Hadoop platform, then puts forward the technical architecture of a data mining platform based on Hadoop. After a thorough examination of the MapReduce programming pattern, the HSPRINT decision tree algorithm is implemented. Finally, the effectiveness of the algorithm is verified through experiments.
Big data such as complex networks with millions of vertices and edges is infeasible to process using conventional computation. MapReduce is a programming model that empowers us to analyze big data on a cluster of computers. In this paper, we propose a Parallel Structural Clustering Algorithm for big Networks (PSCAN) in MapReduce for the detection of clusters or community structures in big networks...
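SCAN-style structural clustering, which PSCAN parallelizes, is built on a structural similarity between adjacent vertices: the overlap of their closed neighborhoods, σ(u,v) = |Γ(u)∩Γ(v)| / √(|Γ(u)|·|Γ(v)|). A small sketch of that core measure in plain Python (the adjacency data and function name are illustrative; the paper's contribution is distributing this over MapReduce, which is not shown here):

```python
# Structural similarity at the heart of SCAN-style clustering.
import math

def sigma(adj, u, v):
    # Closed neighborhoods: each vertex counts itself as a neighbor.
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return len(nu & nv) / math.sqrt(len(nu) * len(nv))

# Tiny example graph as adjacency sets.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(sigma(adj, 1, 2))  # → 1.0 (identical closed neighborhoods)
```

Edges whose similarity exceeds a threshold ε connect "core" vertices into clusters; since σ depends only on the two endpoints' neighbor lists, the per-edge computation shards naturally across mappers.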
Cloud storage has become increasingly popular due to its convenience, cost-effectiveness and scalability. It provides the basis for a slate of file hosting services, which offer users the ability to synchronize their files between the servers and their devices. Naive file synchronization, however, requires the whole file to be transmitted to all other locations (servers, devices) whenever the file...
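The inefficiency of naive whole-file synchronization motivates delta sync: split the file into chunks, hash each chunk, and transmit only the chunks whose hash changed. A minimal fixed-size-chunk sketch in plain Python (illustrative only; production services and rsync use rolling hashes so that insertions do not shift every later chunk):

```python
# Minimal delta-sync sketch: send only chunks whose hash changed.
import hashlib

CHUNK = 4  # tiny chunk size for demonstration

def chunk_hashes(data: bytes):
    # Hash each fixed-size chunk of the content.
    return [hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)]

def changed_chunks(old: bytes, new: bytes):
    # Indices of chunks in `new` that differ from (or extend past) `old`.
    old_h, new_h = chunk_hashes(old), chunk_hashes(new)
    return [i for i, h in enumerate(new_h)
            if i >= len(old_h) or h != old_h[i]]

# One byte changed in the middle chunk → only chunk 1 must be sent.
print(changed_chunks(b"aaaabbbbcccc", b"aaaaXbbbcccc"))  # → [1]
```

With 12 bytes this saves little, but for a multi-gigabyte file with a one-byte edit the same scheme transmits a single chunk instead of the whole file.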
At present, scheduling is a hot research issue in cloud computing; its purpose is to coordinate cloud resources so that they are used fully and rationally. Data locality is one of the main properties of the Hadoop cloud platform in particular. This paper discusses that property and proposes a new improvement to Hadoop's data-locality scheduling algorithm based on LATE. The algorithm...
With the rapid development of the Internet, e-commerce websites now routinely have to work with log datasets up to a few terabytes in size. How to remove messy data promptly at low cost and find useful information is a problem we have to face. The mining process involves several steps, from pre-processing the raw data to establishing the final models. In this paper, we describe our method to...