SC16: International Conference for High Performance Computing, Networking, Storage and Analysis

book

IEEE

chapter

A Parallel Algorithm for Finding All Pairs κ-Mismatch Maximal Common Substrings

Sriram P. Chockalingam, Sharma V. Thankachan, Srinivas Aluru

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 784 - 794

We present an efficient parallel algorithm for the following problem: Given an input collection D of n sequences of total length N, a length threshold f and a mismatch threshold κ, report all κ-mismatch maximal common substrings of length at least f over all pairs of strings in D. This problem is motivated by clustering and assembly applications in computational biology, where D is a collection of...
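
The problem statement above can be illustrated with a small brute-force sketch. This is not the parallel algorithm from the paper; it is only a sequential reference for what "κ-mismatch maximal common substring of length at least f" means for one pair of strings, assuming a segment is maximal when it cannot be extended on either side without exceeding the mismatch budget or running off a string. The function names `k_mismatch_maximal_pairs` and `all_pairs` and the parameters `k` and `min_len` (standing in for κ and f) are hypothetical.

```python
from itertools import combinations


def k_mismatch_maximal_pairs(x, y, k, min_len):
    """Return (start_x, start_y, length) for every maximal aligned segment
    of x and y that has at most k mismatches and length >= min_len."""
    results = []
    # Each diagonal d pairs position i of x with position i + d of y.
    for d in range(-(len(x) - 1), len(y)):
        i0, j0 = max(0, -d), max(0, d)
        n = min(len(x) - i0, len(y) - j0)
        mism = [x[i0 + t] != y[j0 + t] for t in range(n)]
        for lo in range(n):
            # Extend to the right as far as the mismatch budget allows.
            hi, used = lo, 0
            while hi < n and used + mism[hi] <= k:
                used += mism[hi]
                hi += 1
            # Left-maximal: at the diagonal's start, or one step left
            # would add a mismatch beyond the budget.
            left_blocked = lo == 0 or (mism[lo - 1] and used == k)
            if left_blocked and hi - lo >= min_len:
                results.append((i0 + lo, j0 + lo, hi - lo))
    return results


def all_pairs(seqs, k, min_len):
    """Apply the pairwise routine to every pair of sequences in the collection."""
    return {
        (a, b): k_mismatch_maximal_pairs(seqs[a], seqs[b], k, min_len)
        for a, b in combinations(range(len(seqs)), 2)
    }
```

The sketch scans every diagonal of every pair, so its cost grows quadratically in both the number and the length of the sequences; it is useful only for checking small inputs by hand.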

chapter

10M-Core Scalable Fully-Implicit Solver for Nonhydrostatic Atmospheric Dynamics

Chao Yang, Wei Xue, Haohuan Fu, Hongtao You, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 57 - 68

An ultra-scalable fully-implicit solver is developed for stiff time-dependent problems arising from the hyperbolic conservation laws in nonhydrostatic atmospheric dynamics. In the solver, we propose a highly efficient hybrid domain-decomposed multigrid preconditioner that can greatly accelerate the convergence rate at the extreme scale. For solving the overlapped subdomain problems, a geometry-based...

chapter

The Vectorization of the Tersoff Multi-body Potential: An Exercise in Performance Portability

Markus Hohnerbach, Ahmed E. Ismail

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 69 - 81

Molecular dynamics simulations, an indispensable research tool in computational chemistry and materials science, consume a significant portion of the supercomputing cycles around the world. We focus on multi-body potentials and aim at achieving performance portability. Compared with well-studied pair potentials, multibody potentials deliver increased simulation accuracy but are too complex for effective...

chapter

Modeling Dilute Solutions Using First-Principles Molecular Dynamics: Computing more than a Million Atoms with over a Million Cores

Jean-Luc Fattebert, Daniel Osei-Kuffuor, Erik W. Draeger, Tadashi Ogitsu, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 12 - 22

First-Principles Molecular Dynamics (FPMD) methods, although powerful, are notoriously expensive computationally due to the quantum modeling of electrons. Traditional FPMD approaches have typically been limited to a few thousand atoms at most, due to O(N^3) or worse solver complexity and the large amount of communication required for highly parallel implementations. Attempts to lower the complexity...

chapter

Scheduling-Aware Routing for Supercomputers

Jens Domke, Torsten Hoefler

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 142 - 153

The interconnection network has a large influence on total cost, application performance, energy consumption, and overall system efficiency of a supercomputer. Unfortunately, today's routing algorithms do not utilize this important resource most efficiently. We first demonstrate this by defining the dark fiber metric as a measure of unused resource in networks. To improve the utilization, we propose...

chapter

A Multi-faceted Approach to Job Placement for Improved Performance on Extreme-Scale Systems

Christopher Zimmer, Saurabh Gupta, Scott Atchley, Sudharshan S. Vazhkudai, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 1015 - 1025

Job placement plays a pivotal role in application performance on supercomputers. We present a multi-faceted exploration to influence placement in extreme-scale systems, to improve network performance and decrease variability. In our first exploration, Scores, we developed a machine learning model that extracts features from a job's node-allocation and grades performance. This identified several important...

chapter

Extended Task Queuing: Active Messages for Heterogeneous Systems

Michael LeBeane, Brandon Potter, Abhisek Pan, Alexandru Dutu, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 933 - 944

Accelerators have emerged as an important component of modern cloud, datacenter, and HPC computing environments. However, launching tasks on remote accelerators across a network remains unwieldy, forcing programmers to send data in large chunks to amortize the transfer and launch overhead. By combining advances in intra-node accelerator unification with one-sided Remote Direct Memory Access (RDMA)...

chapter

Author index

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 1026 - 1032

Presents an index of the authors whose articles are published in the conference proceedings record.

chapter

Unprotected Computing: A Large-Scale Study of DRAM Raw Error Rate on a Supercomputer

Leonardo Bautista-Gomez, Ferad Zyulkyarov, Osman Unsal, Simon McIntosh-Smith

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 645 - 655

Supercomputers offer new opportunities for scientific computing as they grow in size. However, their growth also poses new challenges. Resilience has been recognized as one of the most pressing issues to solve for extreme scale computing. Transistor scaling in the single-digit nanometer era and power constraints might dramatically increase the failure rate of next generation machines. DRAM errors...

chapter

PIPES: A Language and Compiler for Task-Based Programming on Distributed-Memory Clusters

Martin Kong, Louis-Noel Pouchet, P. Sadayappan, Vivek Sarkar

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 456 - 467

Applications running on clusters of shared-memory computers are often implemented using OpenMP+MPI. Productivity can be vastly improved using task-based programming, a paradigm where the user expresses the data and control-flow relations between tasks, offering the runtime maximal freedom to place and schedule tasks. While productivity is increased, high-performance execution remains challenging:...

chapter

MUSA: A Multi-level Simulation Approach for Next-Generation HPC Machines

Thomas Grass, Cesar Allande, Adria Armejach, Alejandro Rico, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 526 - 537

The complexity of High Performance Computing (HPC) systems is increasing in the number of components and their heterogeneity. Interactions between software and hardware involve many different aspects which are typically not transparent to scientific programmers and system architects. Therefore, predicting the behavior of current scientific applications on future HPC infrastructures is a challenging...

chapter

Caliper: Performance Introspection for HPC Software Stacks

David Boehme, Todd Gamblin, David Beckingsale, Peer-Timo Bremer, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 550 - 560

Many performance engineering tasks, from long-term performance monitoring to post-mortem analysis and online tuning, require efficient runtime methods for introspection and performance data collection. To understand interactions between components in increasingly modular HPC software, performance introspection hooks must be integrated into runtime systems, libraries, and application codes across the...

chapter

A Parallel Arbitrary-Order Accurate AMR Algorithm for the Scalar Advection-Diffusion Equation

Arash Bakhtiari, Dhairya Malhotra, Amir Raoofy, Miriam Mehl, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 514 - 525

We present a numerical method for solving the scalar advection-diffusion equation using adaptive mesh refinement. Our solver has four unique characteristics: (1) it supports arbitrary-order accuracy in space; (2) it allows different discretizations for the velocity and scalar advected quantity; (3) it combines the method of characteristics with an integral equation formulation; and (4) it supports...

chapter

A PCIe Congestion-Aware Performance Model for Densely Populated Accelerator Servers

Maxime Martinasso, Grzegorz Kwasniewski, Sadaf R. Alam, Thomas C. Schulthess, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 739 - 749

MeteoSwiss, the Swiss national weather forecast institute, has selected densely populated accelerator servers as its primary system for running weather forecast simulations. Servers with multiple accelerator devices that are primarily connected by a PCI-Express (PCIe) network achieve a significantly higher energy efficiency. Memory transfers between accelerators in such a system are subjected to PCIe...

chapter

Watch Out for the Bully! Job Interference Study on Dragonfly Network

Xu Yang, John Jenkins, Misbah Mubarak, Robert B. Ross, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 750 - 760

High-radix, low-diameter dragonfly networks will be a common choice in next-generation supercomputers. Preliminary studies show that random job placement with adaptive routing should be the rule of thumb to utilize such networks, since it uniformly distributes traffic and alleviates congestion. Nevertheless, in this work we find that while random job placement coupled with adaptive routing is good...

chapter

Designing Scalable b-MATCHING Algorithms on Distributed Memory Multiprocessors by Approximation

Arif Khan, Alex Pothen, Md. Mostofa Ali Patwary, Mahantesh Halappanavar, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 773 - 783

A b-MATCHING is a subset of edges M such that at most b(v) edges in M are incident on each vertex v, where b(v) is specified. We present a distributed-memory parallel algorithm, b-SUITOR, that computes a b-MATCHING with more than half the maximum weight in a graph with weights on the edges. The approximation algorithm is designed to have high concurrency and low time complexity. We organize the implementation...
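
The definition of a b-MATCHING above can be made concrete with a short sketch. This is not b-SUITOR and not a distributed-memory algorithm; it is the simple sequential greedy heuristic (take edges in decreasing weight order while both endpoints still have capacity), which is also known to reach at least half the maximum weight. The names `greedy_b_matching` and `is_b_matching` and the edge-tuple layout are assumptions made for illustration.

```python
def greedy_b_matching(edges, b):
    """edges: iterable of (u, v, weight); b: dict mapping vertex -> b(v).
    Returns the list of chosen edges."""
    remaining = dict(b)  # capacity left at each vertex
    matching = []
    # Consider heaviest edges first; keep an edge only if both ends have room.
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        if u != v and remaining.get(u, 0) > 0 and remaining.get(v, 0) > 0:
            matching.append((u, v, w))
            remaining[u] -= 1
            remaining[v] -= 1
    return matching


def is_b_matching(matching, b):
    """Check the defining property: at most b(v) chosen edges touch each vertex v."""
    degree = {}
    for u, v, _ in matching:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return all(d <= b.get(x, 0) for x, d in degree.items())
```

For example, with edges `[("a", "b", 5), ("b", "c", 4), ("c", "d", 3), ("a", "c", 2)]` and capacities `{"a": 1, "b": 1, "c": 2, "d": 1}`, the greedy pass keeps the edges a-b and c-d.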

chapter

Accelerating Lattice QCD Multigrid on GPUs Using Fine-Grained Parallelization

M. A. Clark, Balint Joo, Alexei Strelchenko, Michael Cheng, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 795 - 806

The past decade has witnessed a dramatic acceleration of lattice quantum chromodynamics calculations in nuclear and particle physics. This has been due to both significant progress in accelerating the iterative linear solvers using multigrid algorithms, and due to the throughput improvements brought by GPUs. Deploying hierarchical algorithms optimally on GPUs is non-trivial owing to the lack of parallelism...

chapter

Graph Colouring as a Challenge Problem for Dynamic Graph Processing on Distributed Systems

Scott Sallinen, Keita Iwabuchi, Suraj Poudel, Maya Gokhale, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 347 - 358

An unprecedented growth in data generation is taking place. Data about larger dynamic systems is being accumulated, capturing finer granularity events, and thus processing requirements are increasingly approaching real-time. To keep up, data-analytics pipelines need to be viable at massive scale, and switch away from static, offline scenarios to support fully online analysis of dynamic systems. This...

chapter

FlipBack: Automatic Targeted Protection against Silent Data Corruption

Xiang Ni, Laxmikant V. Kale

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 335 - 346

The decreasing size of transistors has been critical to the increase in capacity of supercomputers. The smaller the transistors are, the less energy is required to flip a bit, and thus silent data corruptions (SDCs) become more common. In this paper, we present FlipBack, an automatic software-based approach that protects applications from SDCs. FlipBack provides targeted protection for different types...
