Search results

Items from 1 to 6 out of 6 results

article

The Unicorn Runtime: Efficient Distributed Shared Memory Programming for Hybrid CPU-GPU Clusters

Tarun Beri, Sorav Bansal, Subodh Kumar

IEEE Transactions on Parallel and Distributed Systems > 2017 > 28 > 5 > 1518 - 1534

Programming hybrid CPU-GPU clusters is hard. This paper addresses this difficulty and presents the design and runtime implementation of Unicorn, a parallel programming model for hybrid CPU-GPU clusters. In particular, this paper proves that efficient distributed shared memory style programming is possible and its simplicity can be retained across CPUs...

chapter

Phase-Based Profiling in GPGPU Kernels

Robert Dietrich, Felix Schmitt, Rene Widera, Michael Bussmann

2012 41st International Conference on Parallel Processing Workshops > 414 - 423

2012 41st International Conference on Parallel Processing Workshops (ICPPW)

More and more computationally intensive scientific applications make use of hardware accelerators like general purpose graphics processing units (GPGPUs). Compared to software development for typical multi-core processors their programming is fairly complex and needs hardware specific optimizations to utilize the full computing power. To achieve high performance, critical parts of a program have to...

chapter

Directive-based Programming for GPUs: A Comparative Study

Ruymán Reyes, Ivan Lopez, Juan J. Fumero, Francisco de Sande

2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems > 410 - 417

2012 IEEE 14th Int'l Conf. on High Performance Computing and Communication (HPCC) & 2012 IEEE 9th Int'l Conf. on Embedded Software and Systems (ICESS)

GPUs and other accelerators are available on many different devices, while GPGPU has been massively adopted by the HPC research community. Although a plethora of libraries and applications providing GPU support are available, the need of implementing new algorithms from scratch, or adapting sequential programs to accelerators, will always exist. Writing CUDA or OpenCL codes, although an easier task...

chapter

Communication Library to Overlap Computation and Communication for OpenCL Application

Toshiya Komoda, Shinobu Miwa, Hiroshi Nakamura

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum > 567 - 573

2012 26th IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

User-friendly parallel programming environments, such as CUDA and OpenCL, are widely used for accelerators. They provide programmers with useful APIs, but the APIs are still low level primitives. Therefore, in order to apply communication optimization techniques, such as double buffering techniques, programmers have to manually write the programs with the primitives. Manual communication optimization...

chapter

Productive Programming of GPU Clusters with OmpSs

Javier Bueno, Judit Planas, Alejandro Duran, Rosa M. Badia, et al.

2012 IEEE 26th International Parallel and Distributed Processing Symposium > 557 - 568

2012 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

Clusters of GPUs are emerging as a new computational scenario. Programming them requires the use of hybrid models that increase the complexity of the applications, reducing the productivity of programmers. We present the implementation of OmpSs for clusters of GPUs, which supports asynchrony and heterogeneity for task parallelism. It is based on annotating a serial application with directives that...

chapter

Non-intrusive Performance Analysis of Parallel Hardware Accelerated Applications on Hybrid Architectures

R Dietrich, T Ilsche, G Juckeland

2010 39th International Conference on Parallel Processing Workshops > 135 - 143

2010 39th International Conference on Parallel Processing Workshops (ICPPW)

New high performance computing (HPC) applications recently have to face scalability over an increasing number of nodes and the programming of special accelerator hardware. Hybrid composition of large computing systems leads to a new dimension in complexity of software development. This paper presents a novel approach to gain insight into accelerator interaction and utilization without any changes...

Filter options

Data set:
ieee
Keywords:
KERNEL
RUNTIME
ACCELERATORS

Publication type

book (5)
article (1)

Keywords

PROGRAMMING (4)
GPGPU (3)
GRAPHICS PROCESSING UNIT (3)
CUDA (2)
HARDWARE (2)
INSTRUMENTS (2)
LIBRARIES (2)
OPENCL (2)
OPENMP (2)
OPTIMIZATION (2)
PERFORMANCE ANALYSIS (2)
SYNCHRONIZATION (2)
TRACING (2)
BULK SYNCHRONOUS PARALLELISM (1)
CLUSTER PROGRAMMING (1)
COHERENCE (1)
COMPILER (1)
COMPUTER ARCHITECTURE (1)
CUDA ENVIRONMENT (1)
DISTRIBUTED SYSTEM DESIGN (1)
DOUBLE BUFFERING (1)
EVENT LOGGING (1)
GPGPU COMPUTING (1)
GRAPHICS PROCESSING UNITS (1)
HIGH PERFORMANCE COMPUTING (1)
HPC APPLICATIONS (1)
HYBRID ARCHITECTURES (1)
HYBRID SIMULATION (1)
INSTRUCTION SETS (1)
LARGE COMPUTING SYSTEMS (1)
LOAD BALANCING (1)
MANY-CORE (1)
MEMORY MANAGEMENT (1)
MESSAGE SYSTEMS (1)
MONITORING (1)
MONITORING LIBRARIES (1)
NONINTRUSIVE PERFORMANCE ANALYSIS (1)
OPENACC (1)
OPENCL FRAMEWORK (1)
PARALLEL HARDWARE ACCELERATED APPLICATIONS (1)
PARALLEL PROCESSING (1)
PERFORMANCE EVALUATION (1)
PGI (1)
PROCESSOR SCHEDULING (1)
PRODUCTIVITY (1)
PROFILING (1)
RADIATION DETECTORS (1)
SCHEDULING (1)
SOFTWARE DEVELOPMENT (1)
SOFTWARE ENGINEERING (1)
STANDARDS (1)
STREAM GRAPH (1)
STREAMING MEDIA (1)
UNICORN RUNTIME (1)
