Akhil Arunkumar

chapter

MCM-GPU: Multi-chip-module GPUs for continued performance scalability

Akhil Arunkumar, Evgeny Bolotin, Benjamin Cho, Ugljesa Milic, et al.

2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) > 320 - 332

Historically, improvements in GPU-based high performance computing have been tightly coupled to transistor scaling. As Moore's law slows down, and the number of transistors per die no longer grows at historical rates, the performance curve of single monolithic GPUs will ultimately plateau. However, the need for higher performing GPUs continues to exist in many domains. To address this need, in this...

article

Using Low Cost Erasure and Error Correction Schemes to Improve Reliability of Commodity DRAM Systems

Hsing-Min Chen, Supreet Jeloka, Akhil Arunkumar, David Blaauw, et al.

IEEE Transactions on Computers > 2016 > 65 > 12 > 3766 - 3779

Most server-grade systems provide Chipkill-Correct error protection at the expense of power and performance. In this paper we present a low overhead solution to improving the reliability of commodity DRAM systems with no change in the existing memory architecture. Specifically, we propose five erasure and error correction (E-ECC) schemes that provide at least Chipkill-Correct protection for x4 (Schemes...

chapter

ID-cache: instruction and memory divergence based cache management for GPUs

Akhil Arunkumar, Shin-Ying Lee, Carole-Jean Wu

2016 IEEE International Symposium on Workload Characterization (IISWC) > 1 - 10

Modern graphics processing units (GPUs) are not only able to perform graphics rendering, but can also perform general-purpose parallel computations (GPGPUs). It has been shown that the GPU L1 data cache and the on-chip interconnect bandwidth are important sources of performance bottlenecks and inefficiencies in GPGPUs. Through this work, we aim to understand the sources of inefficiencies and possible opportunities...

chapter

Characterization and Throttling-Based Mitigation of Memory Interference for Heterogeneous Smartphones

Davesh Shingari, Akhil Arunkumar, Carole-Jean Wu

2015 IEEE International Symposium on Workload Characterization (IISWC) > 22 - 33

The availability of a wide range of general purpose as well as accelerator cores on modern smart phones means that a significant number of applications can be executed on a smart phone simultaneously, resulting in an ever increasing demand on the memory subsystem. While the increased computation capability is intended for improving user experience, memory requests from each concurrent application...

chapter

CAWA: Coordinated warp scheduling and Cache Prioritization for critical warp acceleration of GPGPU workloads

Shin-Ying Lee, Akhil Arunkumar, Carole-Jean Wu

2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA) > 515 - 527

The ubiquity of graphics processing unit (GPU) architectures has made them efficient alternatives to chip multiprocessors for parallel workloads. GPUs achieve superior performance by making use of massive multi-threading and fast context-switching to hide pipeline stalls and memory access latency. However, recent characterization results have shown that general purpose GPU (GPGPU) applications commonly...

chapter

ReMAP: Reuse and memory access cost aware eviction policy for last level cache management

Akhil Arunkumar, Carole-Jean Wu

2014 IEEE 32nd International Conference on Computer Design (ICCD) > 110 - 117

To mitigate the significant main memory access latency in modern chip multiprocessors, multi-level on-chip caches are used to bridge the gap by retaining frequently used data closer to the processor cores. Such dependence on the last-level cache (LLC) has motivated numerous innovations in cache management schemes. However, most prior works focus their efforts on optimizing cache miss counts experienced...

chapter

Estimating correlation for a real-time measure of connectivity

Akhil Arunkumar, Ashish Panday, Bharat Joshi, Arun Ravindran, et al.

2012 34th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) > 5190 - 5193

There has recently been considerable interest in connectivity analysis of fMRI and scalp and intracranial EEG time-series. The computational requirements of the pair-wise correlation (PWC), the core time-series measure used to estimate connectivity, present a challenge to the real-time estimation of the PWC between all pairs of multiple time-series. We describe a parallel algorithm for computing...

INFONA - science communication portal

Search results for: Akhil Arunkumar
