Search results

Items from 1 to 5 out of 5 results

chapter

Statistical pattern based modeling of GPU memory access streams

Reena Panda, Xinnian Zheng, Jiajun Wang, Andreas Gerstlauer, more

2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC) > 1 - 6

Recent research studies have shown that modern GPU performance is often limited by memory system performance. Optimizing memory hierarchy performance requires GPU designers to draw design insights from the cache and memory behavior of end-user applications. Unfortunately, it is often difficult to get access to end-user workloads due to the confidential or proprietary nature of the software/data...

chapter

Directive-Based Pipelining Extension for OpenMP

Xuewen Cui, Thomas R. W. Scogland, Bronis R. de Supinski, Wu-Chun Feng

2016 IEEE International Conference on Cluster Computing (CLUSTER) > 481 - 484

Programming models like CUDA, OpenMP, OpenACC and OpenCL are designed to offload compute-intensive workloads to accelerators efficiently. However, the naive offload model, which synchronously copies and executes in sequence, requires extensive hand-tuning of techniques such as pipelining to overlap computation and communication. Therefore, we propose an easy-to-use, directive-based pipelining extension...

article

GREEN Cache: Exploiting the Disciplined Memory Model of OpenCL on GPUs

Jaekyu Lee, Dong Hyuk Woo, Hyesoon Kim, Mani Azimi

IEEE Transactions on Computers > 2015 > 64 > 11 > 3167 - 3180

As various graphics processing unit architectures are deployed across a broad computing spectrum, from hand-held or embedded devices to high-performance computing servers, OpenCL has become the de facto standard programming environment for general-purpose computing on graphics processing units. Unlike its CPU counterpart, OpenCL has several distinct features such as its disciplined memory model, which...

chapter

An Evaluation of Unified Memory Technology on NVIDIA GPUs

Wenqiang Li, Guanghao Jin, Xuewen Cui, Simon See

2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) > 1092 - 1098

Unified Memory is an emerging technology supported by CUDA 6.X. Before CUDA 6.X, the CUDA programming model relied on programmers to explicitly manage data between CPU and GPU, which increased programming complexity. CUDA 6.X introduces Unified Memory, a new programming model that defines CPU and GPU memory space as a single coherent...

chapter

Accelerating Strassen-Winograd's matrix multiplication algorithm on GPUs

Pai-Wei Lai, Humayun Arafat, Venmugil Elango, P. Sadayappan

2013 20th Annual International Conference on High Performance Computing (HiPC) > 139 - 148

In this paper, we report on the development of an efficient GPU implementation of the Strassen-Winograd matrix multiplication algorithm for matrices of arbitrary sizes. We utilize multi-kernel streaming to exploit concurrency across sub-matrix operations in addition to intra-operation parallelism. We evaluate the performance of the implementation in comparison with CUBLAS-5.0 on Fermi and Kepler GPUs...

Filter options

Data set:
ieee
Keywords:
KERNEL
GRAPHICS PROCESSING UNITS
MEMORY MANAGEMENT
COMPUTATIONAL MODELING

INFONA - science communication portal
