2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)

Automatic Communication Optimization of Parallel Applications in Public Clouds

One of the most important aspects that influences the performance of parallel applications is the speed of communication between their tasks. To optimize communication, tasks that exchange lots of data should be mapped to processing units that have a high network performance. This technique is called communication-aware task mapping and requires detailed information about the underlying network topology...

chapter

Tyrex: Size-Based Resource Allocation in MapReduce Frameworks

Bogdan Ghit, Dick Epema

2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) > 11 - 20

Many large-scale data analytics infrastructures are employed for a wide variety of jobs, ranging from short interactive queries to large data analysis jobs that may take hours or even days to complete. As a consequence, data-processing frameworks like MapReduce may have workloads consisting of jobs with heavy-tailed processing requirements. With such workloads, short jobs may experience slowdowns...

chapter

Demand-Aware Power Management for Power-Constrained HPC Systems

Thang Cao, Yuan He, Masaaki Kondo

2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) > 21 - 31

As limited power budget is becoming one of the most crucial challenges in developing supercomputer systems, hardware overprovisioning which installs larger number of nodes beyond the limitations of the power constraint determined by Thermal Design Power is an attractive way to design extreme-scale supercomputers. In this design, power consumption of each node should be controlled by power-knobs equipped...

chapter

Landrush: Rethinking In-Situ Analysis for GPGPU Workflows

Anshuman Goswami, Yuan Tian, Karsten Schwan, Fang Zheng, more

2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) > 32 - 41

In-situ analysis on the output data of scientific simulations has been made necessary by ever-growing output data volumes and increasing costs of data movement as supercomputing is moving towards exascale. With hardware accelerators like GPUs becoming increasingly common in high end machines, new opportunities arise to co-locate scientific simulations and online analysis performed on the scientific...

chapter

Service Level and Performance Aware Dynamic Resource Allocation in Overbooked Data Centers

Luis Tomas, Ewnetu Bayuh Lakew, Erik Elmroth

2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) > 42 - 51

Many cloud computing providers use overbooking to increase their low utilization ratios. This however increases the risk of performance degradation due to interference among co-located VMs. To address this problem we present a service level and performance aware controller that: (1) provides performance isolation for high QoS VMs, and (2) reduces the VM interference between low QoS VMs by dynamically mapping...

chapter

DieHard: Reliable Scheduling to Survive Correlated Failures in Cloud Data Centers

Mina Sedaghat, Eddie Wadbro, John Wilkes, Sara De Luna, more

2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) > 52 - 59

In large scale data centers, a single fault can lead to correlated failures of several physical machines and the tasks running on them, simultaneously. Such correlated failures can severely damage the reliability of a service or a job. This paper models the impact of stochastic and correlated failures on job reliability in a data center. We focus on correlated failures caused by power outages or failures...

chapter

SHMEMPMI -- Shared Memory Based PMI for Improved Performance and Scalability

Sourav Chakraborty, Hari Subramoni, Jonathan Perkins, Dhabaleswar K. Panda

2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) > 60 - 69

Dense systems with large number of cores per node are becoming increasingly popular. Existing designs of the Process Management Interface (PMI) show poor scalability in terms of performance and memory consumption on such systems with large number of processes concurrently accessing the PMI interface. Our analysis shows the local socket-based communication scheme used by PMI to be a major bottleneck. While...

chapter

DiBA: Distributed Power Budget Allocation for Large-Scale Computing Clusters

Masoud Badiei, Xin Zhan, Reza Azimi, Sherief Reda, more

2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) > 70 - 79

Power management has become a central issue in large-scale computing clusters where a considerable amount of energy is consumed and a large operational cost is incurred annually. Traditional power management techniques have a centralized design that creates challenges for scalability of computing clusters. In this work, we develop a framework for distributed power budget allocation that maximizes the utility...

chapter

KOALA-F: A Resource Manager for Scheduling Frameworks in Clusters

Aleksandra Kuzmanovska, Rudolf H. Mak, Dick Epema

2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) > 80 - 89

Due to the diversity in the applications that run in clusters, many different application frameworks have been developed, such as MapReduce for data-intensive batch jobs and Spark for interactive data analytics. A framework is first deployed in a cluster, and then starts executing a large set of jobs that are submitted over time. When multiple such frameworks with time-varying resource demands are...

Keywords

CLOUD COMPUTING (17)
SCHEDULING (6)
BIG DATA (5)
CLOUD (5)
HIGH PERFORMANCE COMPUTING (5)
HPC (5)
GPU (4)
PERFORMANCE EVALUATION (4)
DATA CENTERS (3)
DISTRIBUTED SYSTEMS (3)
MAPREDUCE (3)
PERFORMANCE (3)
AMAZON WEB SERVICES (2)
APPLICATIONS (2)
BIOINFORMATICS (2)
CHECKPOINTING (2)
CLOUD SERVICES (2)
CUDA (2)
DISTRIBUTED PROCESSING (2)
DOCKER (2)
ELASTICITY (2)
FAULT TOLERANCE (2)
FAULT-TOLERANCE (2)
GRAPH PROCESSING (2)
HADOOP (2)
LOAD BALANCING (2)
MACHINE LEARNING (2)
MIDDLEWARE (2)
MPI (2)
MULTI-THREADING (2)
OBJECT STORAGE (2)
PARALLEL I/O (2)
PERFORMANCE INTERFERENCE (2)
PERFORMANCE MODELING (2)
PERFORMANCE OPTIMIZATION (2)
PERFORMANCE PREDICTION (2)
PUBLIC CLOUDS (2)
RELIABILITY (2)
RESOURCE PROVISIONING (2)
SCIENTIFIC WORKFLOWS (2)
SIMULATION (2)
SLURM (2)
TIERED STORAGE (2)
VOLUNTEER COMPUTING (2)
60GHZ WIRELESS (1)
ADAPTATION (1)
ADAPTIVE POWER MANAGEMENT (1)
AIR TRAVEL (1)
ALGEBRAIC RECONSTRUCTION TECHNIQUE (1)
ALGORITHMS FOR ACCELERATORS AND HETEROGENEOUS SYSTEMS (1)
ANALYTICAL MODEL (1)
ANDROID (1)
ANGULARJS (1)
ANONYMITY (1)
APPLICATION PLACEMENT (1)
ASTROPHYSICS (1)
ATMOSPHERIC SIMULATION (1)
AWS LAMBDA (1)
BENCHMARK SUITE (1)
BIG DATA ANALYTICS (1)
BIG DATA MANAGEMENT (1)
BIG GRAPH ANALYTICS (1)
BILLING (1)
BLOCK INDEX AND QUERY (1)
CACHE AFFINITY (1)
CACHE LOCALITY (1)
CACHE REPLACEMENT ALGORITHM (1)
CLOUD BENCHMARKING (1)
CLOUD DATABASE (1)
CLOUD PROVISIONING (1)
CLOUD SECURITY (1)
CLOUDS (1)
CLUSTER (1)
CODE TRANSFORMATION (1)
COLLECTIVE I/O (1)
COLLECTIVE OPERATIONS (1)
COMBINATORIAL AND DATA INTENSIVE APPLICATION (1)
COMBUSTION (1)
COMMUNICATION (1)
COMPACTION (1)
COMPOSED APPLICATIONS (1)
COMPOSITION (1)
COMPUTED TOMOGRAPHY (1)
CONDITIONAL PRIVACY (1)
CONSENSUS ALGORITHM (1)
CONTAINERS (1)
CORRELATED FAILURES (1)
COSMIC RAYS; BIG DATA; CORSIKA; HPC; LAGO (1)
COST OPTIMAL CLUSTER COMPOSITION (1)
COST-EFFECTIVENESS (1)
COST-EFFICIENT PROCESSING (1)
CUDA-AWARE MPI (1)
CYBER-INFRASTRUCTURE (1)
DATA ACCESS PERFORMANCE (1)
DATA ANALYSIS (1)
DATA ANALYTICS (1)
DATA CENTER (1)
DATA LAYOUT REORGANIZATION (1)
DATA LOCALITY (1)
DATA PARALLEL (1)
2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)

Cover Art

Title Page i

Title Page iii

Copyright Page

Table of Contents

Message from the CCGrid 2016 General Chairs

Message from the CCGrid 2016 Program Chairs

Organizing Committee

List of Reviewers

Keynotes

Technical Program Committee

Automatic Communication Optimization of Parallel Applications in Public Clouds

Tyrex: Size-Based Resource Allocation in MapReduce Frameworks

Demand-Aware Power Management for Power-Constrained HPC Systems

Landrush: Rethinking In-Situ Analysis for GPGPU Workflows

Service Level and Performance Aware Dynamic Resource Allocation in Overbooked Data Centers

DieHard: Reliable Scheduling to Survive Correlated Failures in Cloud Data Centers

SHMEMPMI -- Shared Memory Based PMI for Improved Performance and Scalability

DiBA: Distributed Power Budget Allocation for Large-Scale Computing Clusters

KOALA-F: A Resource Manager for Scheduling Frameworks in Clusters
