Search results

article

Bridging the Gap Between OpenMP and Task-Based Runtime Systems for the Fast Multipole Method

Emmanuel Agullo, Olivier Aumage, Berenger Bramas, Olivier Coulaud, et al.

IEEE Transactions on Parallel and Distributed Systems > 2017 > 28 > 10 > 2794 - 2807

With the advent of complex modern architectures, the low-level paradigms long considered sufficient to build High Performance Computing (HPC) numerical codes have met their limits. Achieving efficiency, ensuring portability, while preserving programming tractability on such hardware prompted the HPC community to design new, higher level paradigms while relying on runtime systems to maintain performance...

chapter

The Cloud as an OpenMP Offloading Device

Herve Yviquel, Guido Araujo

2017 46th International Conference on Parallel Processing (ICPP) > 352 - 361

Computation offloading is a programming model in which program fragments (e.g. hot loops) are annotated so that their execution is performed in dedicated hardware or accelerator devices. Although offloading has been extensively used to move computation to GPUs, through directive-based annotation standards like OpenMP, offloading computation to very large computer clusters can become a complex and...

chapter

Energy Efficiency Optimization of Task-Parallel Codes on Asymmetric Architectures

Luis Costero, Francisco D. Igual, Katzalin Olcoz, Francisco Tirado

2017 International Conference on High Performance Computing & Simulation (HPCS) > 402 - 409

We present a family of policies that, integrated within a runtime task scheduler (Nanox), pursue the goal of improving the energy efficiency of task-parallel executions with no intervention from the programmer. The proposed policies tackle the problem by modifying the core operating frequency via DVFS mechanisms, or by enabling/disabling the mapping of tasks to specific cores at selected execution...

chapter

Self-Aware Context in Smart Home Pervasive Platforms

Philippe Lalanda, Eva Gerber-Gaillard, Stephanie Chollet

2017 IEEE International Conference on Autonomic Computing (ICAC) > 119 - 124

Pervasive computing envisions environments where computers are blended into everyday objects in order to provide added-value services to people. A growing number of advanced embedded systems, extended with computing and communication capabilities, are already appearing around us. However, pervasive applications raise major challenges in terms of software engineering and remain hard to develop, deploy,...

chapter

Data Centric Performance Measurement Techniques for Chapel Programs

Hui Zhang, Jeffrey K. Hollingsworth

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) > 377 - 386

Chapel is an emerging PGAS (Partitioned Global Address Space) language whose design goal is to make parallel programming more productive and generally accessible. To date, the implementation effort has focused primarily on correctness over performance. We present a performance measurement technique for Chapel and the idea is also applicable to other PGAS models. The unique feature of our tool is that...

chapter

Combining Both a Component Model and a Task-Based Model for HPC Applications: A Feasibility Study on GYSELA

Olivier Aumage, Julien Bigot, Helene Coullon, Christian Perez, et al.

2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) > 635 - 644

This paper studies the feasibility of efficiently combining both a software component model and a task-based model. Task-based models are known to enable efficient executions on recent HPC computing nodes while component models ease the separation of concerns of applications and thus improve their modularity and adaptability. This paper describes a prototype version of the COMET programming model combining...

chapter

Comparison of Threading Programming Models

Solmaz Salehian, Jiawen Liu, Yonghong Yan

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 766 - 774

In this paper, we provide a comparison of language features and runtime systems of commonly used threading parallel programming models for high performance computing, including OpenMP, Intel Cilk Plus, Intel TBB, OpenACC, Nvidia CUDA, OpenCL, C++11 and PThreads. We then report our performance comparison of OpenMP, Cilk Plus and C++11 for data and task parallelism on CPU using benchmarks. The results show...

chapter

HOMP: Automated Distribution of Parallel Loops and Data in Highly Parallel Accelerator-Based Systems

Yonghong Yan, Jiawen Liu, Kirk W. Cameron, Mariam Umar

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) > 788 - 798

Heterogeneous computing systems, e.g., those with accelerators in addition to the host CPUs, offer accelerated performance for a variety of workloads. However, most parallel programming models require platform dependent, time-consuming hand-tuning efforts for collectively using all the resources in a system to achieve efficient results. In this work, we explore the use of OpenMP parallel language extensions...

chapter

A scalable and composable map-reduce system

Mahwish Arif, Hans Vandierendonck, Dimitrios S. Nikolopoulos, Bronis R. de Supinski

2016 IEEE International Conference on Big Data (Big Data) > 2233 - 2242

This paper presents a novel map-reduce runtime system that is designed for scalability and for composition with other parallel software. We use a modified programming interface that expresses reduction operations over data containers as opposed to key-value pairs. This design choice admits higher efficiency as the programmer can select appropriate data structures. Our runtime targets shared memory...

chapter

MUSA: A Multi-level Simulation Approach for Next-Generation HPC Machines

Thomas Grass, Cesar Allande, Adria Armejach, Alejandro Rico, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 526 - 537

The complexity of High Performance Computing (HPC) systems is increasing in the number of components and their heterogeneity. Interactions between software and hardware involve many different aspects which are typically not transparent to scientific programmers and system architects. Therefore, predicting the behavior of current scientific applications on future HPC infrastructures is a challenging...

chapter

Keynote

Thomas Sterling

2016 Second International Workshop on Extreme Scale Programming Models and Middleware (ESPM2) > 1

chapter

Topology-Aware Performance Optimization and Modeling of Adaptive Mesh Refinement Codes for Exascale

Cy P Chan, John D Bachan, Joseph P Kenny, Jeremiah J Wilke, et al.

2016 First International Workshop on Communication Optimizations in HPC (COMHPC) > 17 - 28

We introduce a topology-aware performance optimization and modeling workflow for AMR simulation that includes two new modeling tools, ProgrAMR and Mota Mapper, which interface with the BoxLib AMR framework and the SSTmacro network simulator. ProgrAMR allows us to generate and model the execution of task dependency graphs from high-level specifications of AMR-based applications, which we demonstrate...

chapter

Metaprogramming-Enabled Parallel Execution of Apparently Sequential C++ Code

David S. Hollman, Janine C. Bennett, Hemanth Kolla, Jonathan Lifflander, et al.

2016 Second International Workshop on Extreme Scale Programming Models and Middleware (ESPM2) > 24 - 31

Task-based execution models have received considerable attention in recent years to meet the performance challenges facing high-performance computing (HPC). In this paper we introduce MetaPASS (Metaprogramming-enabled Parallelism from Apparently Sequential Semantics), a proof-of-concept, non-intrusive header library that enables implicit task-based parallelism in a sequential C++ code. MetaPASS...

chapter

PGAS Communication Runtime for Extreme Large Data Computation

Ryo Matsumiya, Toshio Endo

2016 Second International Workshop on Extreme Scale Programming Models and Middleware (ESPM2) > 10 - 16

For partitioned global address space (PGAS) runtimes, supporting out-of-core data computation is an important issue. Some researchers showed that flash SSDs are useful for out-of-core data computation. In this paper, we introduce ComEx-PM, a PGAS communication runtime. ComEx-PM supports out-of-core data computation using a flash SSD. ComEx-PM launches multiple processes in each node. Memory region...

chapter

Automatic Code Generation and Data Management for an Asynchronous Task-Based Runtime

Muthu Baskaran, Benoit Pradelle, Benoit Meister, Athanasios Konstantinidis, et al.

2016 5th Workshop on Extreme-Scale Programming Tools (ESPT) > 34 - 41

Hardware scaling and low-power considerations associated with the quest for exascale and extreme scale computing are driving system designers to consider new runtime and execution models such as the event-driven-task (EDT) models that enable more concurrency and reduce the amount of synchronization. Further, for performance, productivity, and code sustainability reasons, there is an increasing demand...

chapter

A new model for programming distributed computer based on GPU chip and mobile Agent

Fakhi Hicham, Youssfi Mohammed, Bouattane Omar, Ouajji Hassan

2016 11th International Conference on Intelligent Systems: Theories and Applications (SITA) > 1 - 6

In this paper, we present a new model to describe and program parallel clusters using graphics processing units, multi-agent and distributed systems. The model is based physically on a multitude of computer nodes arranged and coupled according to the paradigm and topology of a multi-agent system. Based on the agent modelling technique and on the Java and C/C++ languages, we develop a framework to build...

chapter

PY-PITS: A Scalable Python Runtime System for the Computation of Partially Idempotent Tasks

Edson Borin, Caian Benedicto, Ian L. Rodrigues, Flavia Pisani, et al.

2016 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW) > 7 - 12

The popularization of multi-core architectures and cloud services has allowed users access to high performance computing infrastructures. However, programming for these systems might be cumbersome due to challenges involving system failures, load balancing, and task scheduling. Aiming at solving these problems, we previously introduced SPITS, a programming model and reference architecture for executing...

chapter

The Open Community Runtime: A runtime system for extreme scale computing

Timothy G. Mattson, Romain Cledat, Vincent Cave, Vivek Sarkar, et al.

2016 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 7

The Open Community Runtime (OCR) is a new runtime system designed to meet the needs of extreme-scale computing. While there is growing support for the idea that future execution models will be based on dynamic tasks, there is little agreement on what else should be included. OCR minimally adds events for synchronization and relocatable data-blocks for data management to form a complete system that...

chapter

Remote sensing data processing acceleration based on multi-core processors

Xiao Zheng, Yong Xue, Jie Guang, Jia Liu

2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS) > 641 - 644

With the spatial, spectral and temporal resolutions of remote sensing data increasing, computing efficiency becomes one of the bottlenecks for remote sensing image data processing, especially where there are time response requirements. In this paper, towards the aerosol optical depth retrieval application from moderate resolution imaging spectroradiometer data, taking the time-consuming interpolation...

chapter

Towards Asynchronous Many-Task in Situ Data Analysis Using Legion

Philippe Pebay, Janine C. Bennett, David Hollman, Sean Treichler, et al.

2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1033 - 1037

We explore the use of asynchronous many-task (AMT) programming models for the implementation of in situ analysis towards the goal of maximizing programmer productivity and overall performance on next generation platforms. We describe how a broad class of statistics algorithms can be transformed from a traditional single-program multiple-data (SPMD) implementation to an AMT implementation, demonstrating...

INFONA - science communication portal
