Hardware design is an essential part of research in high performance computing. Initial efforts in hardware research consist of analyzing design ideas in a software simulator. This allows chip designers to minimize the amount of manufacturing, which would be too costly, and to avoid FPGA designs, which are even more time consuming. Simulating a hardware design involves running many tests that try...
It is widely understood that in situ systems will play a significant role in next-generation HPC systems. The rate at which next-generation leadership machines will be able to generate data will exceed the bandwidth of the planned I/O systems, creating a need for in situ processing to reduce the resulting data. There have been a number of techniques proposed for in situ workflow...
The rapidly growing number of large network analysis problems has led to the emergence of many parallel and distributed graph processing systems—one survey in 2014 identified over 80. Determining the best approach for a given problem is infeasible for most developers. We present an approach and associated software for analyzing the performance and scalability of parallel, open-source graph libraries...
We evaluate the on-node interference caused when co-locating traditional high-performance computing applications with a big-data application. Using kernel benchmarks from the NPB suite and a state-of-the-art graph analytics code, we explore different process placements and the effects they have on application performance. Our results show that the most memory-intensive HPC application (MG) experienced the...
Recent breakthroughs in DNA sequencing have opened up new avenues for bioinformatics, and we have seen increasing demand to make such advanced biomedical technologies cheaper and more accessible. Sequence alignment, the process of matching two gene fragments, is a major bottleneck in Whole Genome Sequencing (WGS). We explored the potential of accelerating the Smith-Waterman sequence alignment algorithm through...
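For readers unfamiliar with the algorithm named in this abstract, the sketch below shows the core Smith-Waterman dynamic-programming recurrence for local alignment. It is a minimal scalar illustration with linear gap penalties; the function name and scoring constants are ours, not taken from the paper, whose acceleration approach the excerpt does not describe.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Minimal Smith-Waterman local alignment score (linear gap penalty).
// Scoring constants are illustrative, not taken from the paper.
int smith_waterman(const std::string& a, const std::string& b,
                   int match = 2, int mismatch = -1, int gap = -2) {
    std::vector<std::vector<int>> H(a.size() + 1,
                                    std::vector<int>(b.size() + 1, 0));
    int best = 0;
    for (size_t i = 1; i <= a.size(); ++i) {
        for (size_t j = 1; j <= b.size(); ++j) {
            int diag = H[i-1][j-1] + (a[i-1] == b[j-1] ? match : mismatch);
            // Local alignment: scores are clamped at zero.
            H[i][j] = std::max({0, diag, H[i-1][j] + gap, H[i][j-1] + gap});
            best = std::max(best, H[i][j]);
        }
    }
    return best;  // highest-scoring local alignment between a and b
}
```

The quadratic table fill, with each cell depending on three neighbors, is what makes the algorithm a bottleneck in WGS and a natural target for acceleration.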
We evaluate the vector performance of the Halide domain-specific language for a computational photography application targeted at Android devices. Our application has existing implementations in C++ and ARM NEON, and these are used as a baseline for performance comparisons with Halide. We give a brief introduction to Halide concepts and describe the structure of our application. We describe how...
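To illustrate the Halide concepts the abstract refers to, here is a minimal sketch of a Halide pipeline. The key idea is the separation of the algorithm (what is computed) from the schedule (how it is vectorized and parallelized). The brighten stage is a placeholder of our own; the paper's actual photography pipeline is not given in the excerpt.

```cpp
#include "Halide.h"
using namespace Halide;

int main() {
    // Algorithm: a simple brighten stage (illustrative placeholder).
    ImageParam input(UInt(8), 2);
    Var x, y;
    Func brighten;
    brighten(x, y) = cast<uint8_t>(min(cast<int>(input(x, y)) * 3 / 2, 255));

    // Schedule: vectorize the inner loop (maps to NEON lanes on ARM)
    // and parallelize rows across cores.
    brighten.vectorize(x, 8).parallel(y);

    brighten.compile_jit();  // JIT-compile; AOT compilation is also possible
    return 0;
}
```

Because the schedule is a separate annotation, the same algorithm can be retuned for different targets without rewriting it, which is the basis for comparing Halide against hand-written NEON code.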
Modern high performance processors are equipped with very wide SIMD instruction sets. SVE (Scalable Vector Extension) is an ARM® SIMD technology that supports vector lengths from 128 bits to 2048 bits. One of its promising features is "vector-length agnostic" programming, which allows the same SVE code to run on hardware of any vector length without any modification of the code. This...
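A minimal sketch (ours, not from the paper) of what vector-length agnostic code looks like with ACLE SVE intrinsics: the element count and loop predicate are queried at run time, so the same binary runs correctly whether the hardware vector is 128 or 2048 bits wide.

```cpp
#include <arm_sve.h>
#include <cstdint>

// Vector-length agnostic a[i] += b[i]: svcntw() returns the number of
// 32-bit lanes on the current hardware, and the predicate masks off
// the tail, so no fixed vector width is baked into the code.
void vla_add(float* a, const float* b, int64_t n) {
    for (int64_t i = 0; i < n; i += (int64_t)svcntw()) {
        svbool_t pg = svwhilelt_b32_s64(i, n);   // active lanes this strip
        svfloat32_t va = svld1_f32(pg, &a[i]);
        svfloat32_t vb = svld1_f32(pg, &b[i]);
        svst1_f32(pg, &a[i], svadd_f32_m(pg, va, vb));
    }
}
```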
The cost of maintaining an application code increases significantly if the code is branched into multiple versions, each of which is optimized for a different architecture. In this work, the default and vector versions of a real-world application code are refactored into a single version, and the differences between the versions are expressed as user-defined code transformations. As a...
As the SIMD width of modern microprocessors widens to keep up with the computational demands of HPC systems, the vector architecture has recently come back into the spotlight. Moreover, a modern vector architecture that retains a large SIMD width and a high B/F ratio has survived and evolved in the HPC community. In this paper, to clarify the potential of the modern vector architecture,...
In recent years, many computer simulation codes have been developed as open-source software. Meanwhile, major processors have adopted vector processing for high performance computing. Hence, computer simulation codes need to be written in a vector-friendly manner to benefit from the computational potential of vector processing. Our study evaluates and analyzes the performance of...
This paper revisits the failure temporal independence hypothesis, which is omnipresent in the analysis of resilience methods for HPC. We explain why a previous approach is incorrect, and we propose a new method to detect failure cascades, i.e., series of non-independent consecutive failures. We use this new method to assess whether public archive failure logs contain failure cascades. Then we design...
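The excerpt does not describe the paper's actual detection method. Purely as an illustration of the underlying idea, one simple (assumed) heuristic groups failures whose inter-arrival times fall below a threshold, on the reasoning that failures much closer together than the mean inter-arrival time are unlikely to be independent.

```cpp
#include <vector>

// Hypothetical cascade grouping, NOT the paper's method: consecutive
// failures closer than `window` seconds are treated as one cascade.
// Assumes `failure_times` is sorted in ascending order.
std::vector<std::vector<double>> group_cascades(
        const std::vector<double>& failure_times, double window) {
    std::vector<std::vector<double>> cascades;
    for (double t : failure_times) {
        if (cascades.empty() || t - cascades.back().back() > window)
            cascades.push_back({t});       // start a new cascade
        else
            cascades.back().push_back(t);  // extend the current cascade
    }
    return cascades;  // singleton groups correspond to isolated failures
}
```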
Future high-performance computing (HPC) systems with ever-increasing resource capacity (such as compute cores, memory, and storage) may significantly increase the risks to reliability. Silent data corruptions (SDCs), or silent errors, are among the major sources of corrupted HPC execution results. Unlike fail-stop errors, SDCs are harmful and dangerous in that they cannot be detected by hardware....
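The excerpt stops before the paper's method. A commonly used generic approach to software SDC detection, sketched here under our own assumptions rather than as this paper's technique, checks each new value of an evolving dataset against a prediction-based bound, exploiting the fact that HPC simulation state usually changes smoothly over time.

```cpp
#include <cmath>
#include <vector>

// Illustrative prediction-based SDC check (not the paper's method):
// flag a time series if a value deviates from a last-value prediction
// by more than `bound`, since an abrupt jump may indicate a bit flip.
bool looks_corrupted(const std::vector<double>& series, double bound) {
    for (size_t i = 1; i < series.size(); ++i)
        if (std::fabs(series[i] - series[i - 1]) > bound)
            return true;  // possible silent data corruption
    return false;
}
```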
Fault tolerance is becoming increasingly important as we enter the era of exascale computing. Increasing the number of cores results in a smaller mean time between failures and, consequently, a higher probability of errors. Among the different software fault tolerance techniques, checkpoint/restart is the most commonly used method in supercomputers and the de facto standard for large-scale systems. Although...
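For concreteness, here is a minimal sketch of application-level checkpoint/restart: periodically serialize the application state, and on startup restore it if a checkpoint exists. The file name and state layout are ours; production systems add atomic file replacement, checksums, and parallel I/O.

```cpp
#include <fstream>
#include <vector>

// Minimal application-level checkpoint/restart sketch (illustrative).
void checkpoint(const std::vector<double>& state, int step,
                const char* path = "app.ckpt") {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(reinterpret_cast<const char*>(&step), sizeof(step));
    out.write(reinterpret_cast<const char*>(state.data()),
              state.size() * sizeof(double));
}

// Returns false if no checkpoint exists (start from scratch).
// Assumes `state` has already been resized to its checkpointed length.
bool restart(std::vector<double>& state, int& step,
             const char* path = "app.ckpt") {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    in.read(reinterpret_cast<char*>(&step), sizeof(step));
    in.read(reinterpret_cast<char*>(state.data()),
            state.size() * sizeof(double));
    return true;
}
```

The checkpoint interval is the key tuning knob: frequent checkpoints waste I/O bandwidth, while infrequent ones lose more work per failure.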
The continuous growth of high-performance computing (HPC) systems has led to Fault Tolerance (FT) being identified as one of the major challenges for exascale computing, due to the expected decrease in Mean Time Between Failures (MTBF). One source of faults is soft errors, which can cause bit corruptions in the data held in memory. Current solutions for protection against these errors include hardware...
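One generic software-side protection scheme against such bit corruptions, shown here as a sketch under our own assumptions rather than as this paper's specific technique, stores a checksum alongside the data and re-verifies it before use; a mismatch signals that a bit flipped in memory.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative Fletcher-style checksum over a buffer. Store the result
// next to the data; if a later recomputation disagrees, a soft error
// has corrupted the memory. Detection only; codes like ECC can correct.
uint64_t simple_checksum(const uint32_t* data, size_t n) {
    uint64_t lo = 0, hi = 0;
    for (size_t i = 0; i < n; ++i) {
        lo += data[i];  // running sum catches single bit flips
        hi += lo;       // position-weighted sum catches reorderings
    }
    return (hi << 32) | (lo & 0xffffffffu);
}
```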
Due to the growing size of compute clusters, large-scale parallel applications increasingly have to deal with hardware malfunctions and other failure scenarios during execution. The overall goal of this research is to achieve good performance for MapReduce applications despite failures. The paper focuses on evaluating the performance of two representative Hadoop MapReduce applications, 'WordCount' and...
Current HPC environments require parallel programs that are both malleable and fault-tolerant. Malleability denotes the ability to embrace system-initiated resource changes, and fault tolerance denotes the ability to cope with, e.g., permanent node failures. This paper considers the task pool pattern, specifically its lifeline-based variant. It builds on a previous fault-tolerant realization and integrates...
Today's high-performance computing (HPC) systems are heavily instrumented, generating logs that contain information about abnormal events (such as critical conditions, faults, errors, and failures), about system resource utilization, and about the resource usage of user applications. Once fully analyzed and correlated, these logs can produce detailed information about system health, the root causes of failures,...
The size and complexity of contemporary High Performance Computing (HPC) systems continue to grow. While the reliability of a single component or compute node is high, the huge number of components comprising these systems means that defects occur regularly. This drives the need to manage failure situations. Common issues are component failures or node soft lock-ups that typically...
This paper discusses the motivation and implementation behind Cray's Project Caribou. Project Caribou enables users to correlate HPC job performance with Lustre file systems through collected metrics and events. We discuss use cases, the sources of the metrics that are collected, correlation, and how the data is visualized. Additional topics include the events and alerts that are available, as well as...
System monitoring is an established tool for measuring the utilization and health of HPC systems. Usually, system monitoring infrastructures make no connection to job information and do not utilize hardware performance monitoring (HPM) data. To make the use of HPC systems more efficient, automatic and continuous performance monitoring of jobs is an essential component. It can help to identify pathological...