2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

chapter

Introduction to CHIUW Workshop

Tom MacDonald, Michael Ferguson

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1083 - 1084

Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.

chapter

Large-Scale Stochastic Learning Using GPUs

Thomas Parnell, Celestine Duenner, Kubilay Atasu, Manolis Sifalakis, more

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 419 - 428

In this work we propose an accelerated stochastic learning system for very large-scale applications. Acceleration is achieved by mapping the training algorithm onto massively parallel processors: we demonstrate a parallel, asynchronous GPU implementation of the widely used stochastic coordinate descent/ascent algorithm that can provide up to 35× speed-up over a sequential CPU implementation. In order...

chapter

Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing

Nitin A. Gawande, Joshua B. Landwehr, Jeff A. Daily, Nathan R. Tallent, more

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 399 - 408

Deep Learning (DL) algorithms have become ubiquitous in data analytics. As a result, major computing vendors — including NVIDIA, Intel, AMD and IBM — have architectural road-maps influenced by DL workloads. Furthermore, several vendors have recently advertised new computing products as accelerating DL workloads. Unfortunately, it is difficult for data scientists to quantify the potential of these...

chapter

The New Large-Scale RNNLM System Based on Distributed Neuron

Dejiao Niu, Rui Xue, Tao Cai, Hai Li, more

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 433 - 436

RNNLM (Recurrent Neural Network Language Model) can save the historical information of the training dataset in the last hidden layer, which can also serve as input for training. It has become an interesting topic in the field of Natural Language Processing research. However, the immense training time overhead is a big problem. The large output layer, hidden layer, last hidden layer and the connections among...

chapter

Distributed and in-Situ Machine Learning for Smart-Homes and Buildings: Application to Alarm Sounds Detection

Amaury Durand, Yanik Ngoko, Christophe Cerin

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 429 - 432

We consider the implementation of an in-situ machine learning system with the computing model promoted by Qarnot computing. Qarnot introduced a utility computing model in which servers are distributed in homes and offices where they serve as heaters. The Qarnot servers also embed several sensors for temperature, humidity, CO2, etc. Qarnot offers an adequate platform to develop in-situ workflows for...

chapter

Introduction to PDCO Workshop

Gregoire Danoy, Didier El Baz

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 441

Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.

chapter

A Parallel Approximation Algorithm for Scheduling Parallel Identical Machines

Laleh Ghalami, Daniel Grosu

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 442 - 451

We present the design and analysis of a parallel approximation algorithm for the problem of scheduling jobs on parallel identical machines to minimize makespan. The design of the parallel approximation algorithm is based on the best existing polynomial-time approximation scheme (PTAS) for the problem. To the best of our knowledge, this is the first practical parallel approximation algorithm for the...

chapter

Introduction to HiCOMB Workshop

Alex Pothen, Ananth Grama

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 251

Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.

chapter

A Highly Scalable and Efficient Parallel Design of N-Body Simulation on FPGA

Emanuele Del Sozzo, Lorenzo Di Tucci, Marco D. Santambrogio

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 241 - 246

N-Body simulation simulates the evolution of a system that is composed of N particles, where each element receives a force that is due to the interaction with all the other elements within the system. Usually, the influence of external physical forces, such as gravity, is involved too. This methodology is widely used in different fields that range from astrophysics, where it is used to study the interaction...

chapter

Out-of-Order Execution of Buffered Function Units in Exposed Data Path Architectures

Tripti Jain, Klaus Schneider, Frederik Walk

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 229 - 234

Some of the newer processor architectures are no longer based on registers in order to increase their potential of instruction-level parallelism. Instead, they expose their data paths to the compiler so that the program is able to directly move data values between function units using suitable instructions. Some of these architectures require a synchronous transfer of data values while others use...

chapter

A Near Optimal Integrated Solution for Resource Constrained Scheduling, Binding and Routing on CGRAs

Tajas Ruschke, Lukas Johannes Jung, Christian Hochberger

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 213 - 218

Mapping problems to Coarse Grained Reconfigurable Arrays (CGRA) has been researched for many years now. Yet, no feasible mapping algorithms are known that can be considered optimal or even near optimal. The main reason for this deficit is the complex nature of the mapping problem. It can be considered as a combined scheduling, binding and routing problem. It involves several constraints that need...

chapter

Examining the Reproducibility of Using Dynamic Loop Scheduling Techniques in Scientific Applications

Franziska Hoffeins, Florina M. Ciorba, Ioana Banicescu

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1579 - 1587

Reproducibility of the execution of scientific applications on parallel and distributed systems is a growing concern, underlying the trustworthiness of the experiments and the conclusions derived from experiments. Dynamic loop scheduling (DLS) techniques are an effective approach towards performance improvement of scientific applications via load balancing. These techniques address algorithmic and...

chapter

Redesigning OP2 Compiler to Use HPX Runtime Asynchronous Techniques

Zahra Khatami, Hartmut Kaiser, J. Ramanujam

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1198 - 1207

Maximizing parallelism level in applications can be achieved by minimizing overheads due to load imbalances and waiting time due to memory latencies. Compiler optimization is one of the most effective solutions to tackle this problem. The compiler is able to detect the data dependencies in an application and is able to analyze the specific sections of code for parallelization potential. However, all...

chapter

Scalable Hierarchical Multipole Methods Using an Asynchronous Many-Tasking Runtime System

Jackson DeBuhr, Bo Zhang, Luke D'Alessandro

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1226 - 1234

Hierarchical Multipole Methods (HMMs) are an important class of methods in scientific and engineering applications. They are challenging to parallelize for contemporary and emerging platforms using existing programming models. Asynchronous many-tasking (AMT) execution models provide abstractions suitable for HMMs and promise scalability in the context of future exascale systems. In our work we (1)...

chapter

An Application-Aware Data Replacement Policy for Interactive Large-Scale Scientific Visualization

Lina Yu, Hongfeng Yu, Hong Jiang, Jun Wang

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1216 - 1225

The unprecedented amounts of data generated from large scientific simulations impose a grand challenge in data analytics, and I/O becomes a major performance bottleneck. To address this challenge, we present an application-aware I/O optimization technique in support of interactive large-scale scientific visualization. We partition scientific data into blocks, and carefully place data blocks...

chapter

Architecting the Discontinuous Deformation Analysis Method Pipeline on the GPU

Yunfan Xiao, Min Huang, Qinghai Miao, Jun Xiao, more

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1188 - 1197

As an important numerical analysis method of rock mechanics, discontinuous deformation analysis (DDA) has been widely used in rock engineering. DDA has certain advantages such as the large time step and the large deformation, at the cost of relatively low computing efficiency. To address the efficiency bottleneck of DDA, this paper proposes a complete graphics processing unit (GPU)-based version....

chapter

Efficient Data Structures for a Hybrid Parallel and Vectorized Particle-in-Cell Code

Yann Barsamian, Sever A. Hirstoaga, Eric Violard

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1168 - 1177

The contribution of the present work relies on an innovative and judicious combination of several optimization techniques for achieving high performance when using automatic vectorization and hybrid MPI/OpenMP parallelism in a Particle-in-Cell (PIC) code. The domain of application is plasma physics: the code simulates 2d2v Vlasov-Poisson systems on Cartesian grids with periodic boundary conditions...

chapter

An Analysis of Resilience Techniques for Exascale Computing Platforms

Daniel Dauwe, Sudeep Pasricha, Anthony A. Maciejewski, Howard Jay Siegel

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 914 - 923

With the increase in the complexity and number of nodes in large-scale high performance computing (HPC) systems, the probability of applications experiencing failures has increased significantly. As the computational demands of applications that execute on HPC systems increase, projections indicate that applications executing on exascale-sized systems are likely to operate with a mean time between...

chapter

HPPAC Keynote Talk

Kirk W. Cameron

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 953

Provides an abstract of the keynote presentation and a brief professional biography of the presenter. The complete presentation was not made available for publication as part of the conference proceedings.

chapter

Using LLVM for Optimized Lightweight Binary Re-Writing at Runtime

Alexis Engelke, Josef Weidendorfer

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 785 - 794

Providing new parallel programming models/abstractions as a set of library functions has the huge advantage that it allows for a relatively easy incremental porting path for legacy HPC applications, in contrast to the huge effort needed when novel concepts are only provided in new programming languages or language extensions. However, performance issues are to be expected with fine granular usage...
