Search results

Items from 41 to 60 out of 928 results

chapter

GPU in-Memory Processing Using Spark for Iterative Computation

Sumin Hong, Woohyuk Choi, Won-Ki Jeong

2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) > 31 - 41

Due to its simplicity and scalability, MapReduce has become a de facto standard computing model for big data processing. Since the original MapReduce model was only appropriate for embarrassingly parallel batch processing, many follow-up studies have focused on improving the efficiency and performance of the model. Spark follows one of these recent trends by providing in-memory processing capability...

chapter

Crowdsourced Data Integrity Verification for Key-Value Stores in the Cloud

Grisha Weintraub, Ehud Gudes

2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) > 498 - 503

Thanks to their high availability, scalability, and usability, cloud databases have become one of the dominant cloud services. However, since cloud users do not physically possess their data, data integrity may be at risk. In this paper, we present a novel protocol that utilizes the crowdsourcing paradigm to provide practical data integrity assurance in key-value cloud databases. The main advantage of...

chapter

Multi-class active learning: A hybrid informative and representative criterion inspired approach

Zengmao Wang, Bo Du, Lefei Zhang

2017 International Joint Conference on Neural Networks (IJCNN) > 1510 - 1517

Labeling each instance in a large-scale data set is extremely labor- and time-consuming. One way to alleviate this problem is active learning, which aims to discover the most valuable instances for labeling to construct a powerful classifier with low generalization error. Considering both informativeness and representativeness provides a promising way to design a practical active learning. However,...

chapter

Robotics data real-time management based on NoSQL solution

Afef Gueidi, Hamza Gharsellaoui, Samir Ben Ahmed

2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS) > 131 - 136

Nowadays, robotics database management systems are increasingly common. These systems ensure good data storage, and with big data analytics, a new approach demands new structures and methods for collecting, recording, and analyzing enterprise data. This paper deals with NoSQL databases, which are key to handling the continual growth of data for which new data management solutions have emerged....

chapter

Cooperative learning: Decentralized data neural network

Noah Lewis, Sergey Plis, Vince Calhoun

2017 International Joint Conference on Neural Networks (IJCNN) > 324 - 331

Researchers often wish to study data stored in separate locations, such as when several research entities wish to make inferences from their combined data. The most common solution is to centralize the data in one location. However, certain types of data can be difficult to transfer between entities due to legal or practical reasons. This makes centralizing these types of data problematic. A possible...

chapter

Automated Dynamic Data Redistribution

Thomas Marrinan, Joseph A. Insley, Silvio Rizzi, Francois Tessier, et al.

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1208 - 1215

High-performance distributed memory applications often load or receive data in a format that differs from what the application uses. One such difference arises from how the application distributes data for parallel processing. Data must be redistributed from how it was laid out by the producer to how the application needs the data to be laid out amongst its processes. In this paper, we present a large-scale...

chapter

A group based genetic algorithm data replica placement strategy for scientific workflow

Lihui Liu, Ying Yang, Haibo Wang, Zhifei Tan, et al.

2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS) > 459 - 464

When running data-intensive scientific workflows in a multi-data-center environment, massive data movement is inevitable. The emergence of cloud computing technologies offers a new way to develop scientific workflow systems, and using dataset replicas to reduce data transfer among data centers is an important issue. In this paper, we propose a group based genetic algorithm which...

chapter

Two-step RDF query processing for Linked Data

Yongju Lee, Changsu Kim

2017 11th International Conference on Research Challenges in Information Science (RCIS) > 459 - 460

Since RDF triples are modeled as graphs, we cannot directly adopt existing solutions from relational databases and XML technologies. Thus, there are still a number of open problems in the area of Linked Data. We present a hybrid method between centralized and distributed approaches. By using auxiliary indexes based on the MBB approximation, our approach can retrieve distributed Linked Data efficiently...

article

Aggregating Uncertain Incast Transfers in BCube-Like Data Centers

Deke Guo

IEEE Transactions on Parallel and Distributed Systems > 2017 > 28 > 4 > 934 - 946

Many data-intensive applications like MapReduce are network-bound in data centers, due to the transfer of massive amounts of flows across successive processing stages. Data flows in such an incast or shuffle transfer are highly correlated and aggregated at the receiver side. Prior work aims to aggregate correlated flows of each transfer, during the transmission phase as early as possible, so as to directly...

chapter

Big Data Analysis: Structuring of Data

Riya Ojha, Rakshit Singh, Aditya Singh

2017 International Conference on Technical Advancements in Computers and Communications (ICTACC) > 144 - 147

In the emerging field of big data, a large volume of data has to be managed. Operating on data of huge volume becomes easier when it is sorted and structured. The data can be structured using a simple algorithm, i.e. an index algorithm, which stores and categorizes data on the basis of their application. This in turn will be very beneficial at the business level as well as at the software level.

chapter

PROV-TE: A Provenance-Driven Diagnostic Framework for Task Eviction in Data Centers

Abdulaziz Albatli, David McKee, Paul Townend, Lydia Lau, et al.

2017 IEEE Third International Conference on Big Data Computing Service and Applications (BigDataService) > 233 - 242

Cloud Computing allows users to control substantial computing power for complex data processing, generating huge and complex data. However, the virtual resources requested by users are rarely utilized to their full capacities. To mitigate this, providers often perform over-commitment to maximize profit, which can result in node overloading and consequent task eviction. This paper presents a novel...

chapter

Feisu: Fast Query Execution over Heterogeneous Data Sources on Large-Scale Clusters

An Qin, Yuan Yuan, Dai Tan, Pengyu Sun, et al.

2017 IEEE 33rd International Conference on Data Engineering (ICDE) > 1173 - 1182

Fast data analytics at an increasingly large scale has become a critical task in any Internet service company. For example, in Baidu, the major search engine company in China, PB-scale volumes of Web and business data are constantly acquired and analyzed in a timely manner for the purposes of evaluating product revenue, tracking product demand in the market, predicting user behavior, upgrading...

chapter

Analysis of data pre-processing influence on intrusion detection using NSL-KDD dataset

Nerijus Paulauskas, Juozas Auskalnis

2017 Open Conference of Electrical, Electronic and Information Sciences (eStream) > 1 - 5

Data pre-processing for machine learning methods is a key step in the knowledge discovery process. Depending on the nature of the data, pre-processing might take the majority of the data analysis time. Correctly prepared data guarantees precise and reliable results of data analysis. This paper analyses the influence of initial data pre-processing on attack detection accuracy by using Decision Trees, Naïve...

chapter

Table of contents

2017 5th International Istanbul Smart Grid and Cities Congress and Fair (ICSG) > 2 - 3

chapter

KeystoneML: Optimizing Pipelines for Large-Scale Advanced Analytics

Evan R. Sparks, Shivaram Venkataraman, Tomer Kaftan, Michael J. Franklin, et al.

2017 IEEE 33rd International Conference on Data Engineering (ICDE) > 535 - 546

Modern advanced analytics applications make use of machine learning techniques and contain multiple steps of domain-specific and general-purpose processing with high resource requirements. We present KeystoneML, a system that captures and optimizes the end-to-end large-scale machine learning applications for high-throughput training in a distributed environment with a high-level API. This approach...

chapter

In-Memory Distributed Matrix Computation Processing and Optimization

Yongyang Yu, Mingjie Tang, Walid G. Aref, Qutaibah M. Malluhi, et al.

2017 IEEE 33rd International Conference on Data Engineering (ICDE) > 1047 - 1058

The use of large-scale machine learning and data mining methods is becoming ubiquitous in many application domains ranging from business intelligence and bioinformatics to self-driving cars. These methods heavily rely on matrix computations, and it is hence critical to make these computations scalable and efficient. These matrix computations are often complex and involve multiple steps that need to...

chapter

A Fined-Grained Privacy-Preserving Access Control Protocol in Wireless Sensor Networks

Jie Cui, Hong Zhong, Xuan Tang, Jing Zhang

2016 IEEE/ACM 9th International Conference on Utility and Cloud Computing (UCC) > 382 - 387

For single-owner multi-user wireless sensor networks (WSNs), there is demand for a user privacy-preserving access control protocol. Firstly, we propose a new access control protocol based on an efficient attribute-based signature. In the protocol, users need to pay for queries, and the protocol achieves fine-grained access control and privacy protection. Then, the protocol is analyzed...

chapter

Large-scale time series data down-sampling based on Map-Reduce programming mode

Jiajia Xu, Yichang Qiu, Haiying Zhang, Meng Li, et al.

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) > 409 - 413

In recent decades, more and more time series data has been collected in many fields, especially in industry, where the volume has increased greatly. One of the most common types of data visualization is the line chart, but in industry, time series datasets are so huge that drawing them as a line chart takes much more time. In this case, we must reduce dimensionality...

chapter

Some key problems of data management in army data engineering based on big data

Xiao HongJu, Wang Fei, Wang FenMei, Wang XiuZhen

2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA) > 149 - 152

This paper analyzes the challenges of data management in army data engineering, such as large data volume, data heterogeneity, high rates of data generation and update, strict timing requirements for data processing, and widely separated data sources. We discuss the disadvantages of traditional data management technologies in dealing with these problems. We also highlight the key problems of data management...

chapter

MRSIM: Mitigating Reducer Skew In MapReduce

Lei Chen, Wei Lu, Xiaoping Che, Weiwei Xing, et al.

2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA) > 379 - 384

MapReduce has emerged as a popular programming model in the field of data-intensive computing. This is due to its simplistic design, which provides ease of use for programmers, and its framework implementations such as Hadoop, which have been adopted by large business and technology companies. One significant issue in practical MapReduce applications is data skew: the imbalance in the amount of data...

Keywords:
DATA MODELS
DISTRIBUTED DATABASES

Content availability

Available (920)
None (8)

Publication type

book (816)
article (112)

Keywords

COMPUTATIONAL MODELING (217)
DATA MINING (141)
SERVERS (115)
CLOUD COMPUTING (108)
BIG DATA (79)
ALGORITHM DESIGN AND ANALYSIS (70)
ANALYTICAL MODELS (70)
COMPUTER ARCHITECTURE (61)
GRID COMPUTING (61)
INTERNET (60)
XML (54)
MONITORING (53)
DATABASES (51)
DISTRIBUTED PROCESSING (49)
MAPREDUCE (47)
INDEXES (46)
QUERY PROCESSING (46)
PROTOCOLS (43)
OPTIMIZATION (42)
WEB SERVICES (42)
BANDWIDTH (41)
MIDDLEWARE (40)
SCALABILITY (39)
COMPUTERS (38)
DATA INTEGRATION (37)
SOFTWARE (37)
BIOLOGICAL SYSTEM MODELING (36)
DATA PROCESSING (35)
GEOGRAPHIC INFORMATION SYSTEMS (35)
PREDICTIVE MODELS (35)
RELATIONAL DATABASES (35)
CLUSTERING ALGORITHMS (34)
ONTOLOGIES (34)
BUSINESS (33)
DATA ANALYSIS (33)
SPATIAL DATABASES (33)
WIRELESS SENSOR NETWORKS (33)
MATHEMATICAL MODEL (32)
RESOURCE MANAGEMENT (32)
SEMANTICS (32)
AVAILABILITY (30)
OBJECT ORIENTED MODELING (30)
ORGANIZATIONS (30)
PEER-TO-PEER COMPUTING (30)
DATA HANDLING (29)
MEMORY (29)
PROGRAMMING (29)
REAL-TIME SYSTEMS (29)
TRAINING (29)
HADOOP (28)
HEURISTIC ALGORITHMS (28)
DATA STRUCTURES (27)
DATA VISUALIZATION (27)
SECURITY (27)
DATA PRIVACY (26)
STANDARDS (26)
PARALLEL PROCESSING (25)
SCHEDULING (25)
ACCURACY (24)
COLLABORATION (24)
DISTRIBUTED COMPUTING (24)
LIBRARIES (24)
REAL TIME SYSTEMS (24)
SYNCHRONIZATION (24)
PEER TO PEER COMPUTING (23)
DATABASE SYSTEMS (22)
ESTIMATION (22)
LOAD MODELING (22)
RESOURCE DESCRIPTION FRAMEWORK (22)
DATA GRID (21)
CORRELATION (20)
DISTRIBUTED SYSTEMS (20)
META DATA (20)
SENSORS (20)
VECTORS (20)
CONTEXT (19)
EDUCATIONAL INSTITUTIONS (19)
FILE SYSTEMS (19)
ONTOLOGY (19)
REMOTE SENSING (19)
RUNTIME (19)
ARRAYS (18)
DATA MANAGEMENT (18)
MOBILE COMMUNICATION (18)
ONTOLOGIES (ARTIFICIAL INTELLIGENCE) (18)
CONFERENCES (17)
DELAY (17)
DISTRIBUTED DATABASE (17)
ENGINES (17)
EQUATIONS (17)
NOSQL (17)
QUALITY OF SERVICE (17)
SECURITY OF DATA (17)
RELIABILITY (16)
TIME FACTORS (16)
UNIFIED MODELING LANGUAGE (16)
COMMUNITIES (15)
DATA COMMUNICATION (15)

Data set

ieee (927)
Springer (1)

INFONA - science communication portal
