Limited power budgets and the need for high performance computing have led to platform customization with a number of accelerators integrated with CMPs. In order to study customized architectures, we model four customization design points and compare their performance and energy across a number of computer vision workloads. We analyze the limitations of generic architectures and quantify the costs...
Until recently, it was considered that a performance-effective implementation of Value Prediction (VP) would add tremendous complexity and power consumption to the pipeline, especially in the Out-of-Order engine and the predictor infrastructure. Despite recent progress in the field of Value Prediction, this remains partially true. Indeed, if the recent EOLE architecture proposition suggests that the...
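To fix ideas, here is a minimal sketch of the simplest flavor of value prediction, a last-value predictor with confidence counters. It is illustrative Python, not the EOLE design; the table organization and saturation threshold are assumptions.

    # Minimal last-value predictor sketch (illustrative, not EOLE).
    # Predicts that an instruction will produce the same value as last time;
    # a confidence counter gates predictions so only stable values are used.
    class LastValuePredictor:
        def __init__(self, threshold=3):
            self.table = {}          # pc -> (last_value, confidence)
            self.threshold = threshold

        def predict(self, pc):
            value, conf = self.table.get(pc, (None, 0))
            return value if conf >= self.threshold else None

        def train(self, pc, actual):
            value, conf = self.table.get(pc, (None, 0))
            if value == actual:
                self.table[pc] = (value, min(conf + 1, 7))   # saturating counter
            else:
                self.table[pc] = (actual, 0)                 # reset on misprediction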
Sorting is a widely studied problem in computer science and an elementary building block in many of its subfields. There are several known techniques to vectorise and accelerate a handful of sorting algorithms by using single instruction-multiple data (SIMD) instructions. It is expected that the widths and capabilities of SIMD support will improve dramatically in future microprocessor generations...
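The vectorizable primitive underlying such algorithms is the data-independent compare-exchange, which maps to one vector MIN and one vector MAX instruction. A hedged Python sketch, with lists standing in for SIMD registers (one element per lane):

    # Compare-exchange primitive behind SIMD sorting networks.
    # Each call corresponds to one vector MIN and one vector MAX.
    def compare_exchange(a, b):
        lo = [min(x, y) for x, y in zip(a, b)]   # vector MIN across lanes
        hi = [max(x, y) for x, y in zip(a, b)]   # vector MAX across lanes
        return lo, hi

    lo, hi = compare_exchange([5, 1, 7, 2], [3, 4, 6, 8])
    # lo == [3, 1, 6, 2], hi == [5, 4, 7, 8]

Bitonic and odd-even networks sort fixed-size blocks with a data-independent sequence of such steps, which is exactly what makes them amenable to SIMD.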
Memory bandwidth is a crucial resource in computing systems. Current CMP/SMT processors have a significant number of cores and they can run many threads concurrently. This large thread count adds high pressure to the memory bus, which demands high bandwidth to service memory requests from the cores. Hardware data prefetching is a well-known technique for hiding memory latency. Due to its speculative...
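As a concrete example of hardware data prefetching (an assumed textbook design, not one specific to this paper), a stride prefetcher tracks the last address and stride per load PC and fetches ahead once the stride repeats:

    # Minimal reference-prediction-table stride prefetcher sketch.
    class StridePrefetcher:
        def __init__(self, degree=1):
            self.table = {}        # pc -> (last_addr, stride)
            self.degree = degree   # how many blocks ahead to prefetch

        def access(self, pc, addr):
            prefetches = []
            if pc in self.table:
                last_addr, stride = self.table[pc]
                new_stride = addr - last_addr
                if new_stride == stride and stride != 0:   # stride confirmed twice
                    prefetches = [addr + stride * (i + 1) for i in range(self.degree)]
                self.table[pc] = (addr, new_stride)
            else:
                self.table[pc] = (addr, 0)
            return prefetches    # addresses to fetch speculatively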
We introduce a set of new Compression-Aware Management Policies (CAMP) for on-chip caches that employ data compression. Our management policies are based on two key ideas. First, we show that it is possible to build a more efficient management policy for compressed caches if the compressed block size is directly used in calculating the value (importance) of a block to the cache. This leads to Minimal-Value...
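A toy sketch of size-aware eviction in the spirit of that first idea: a block's value divides an estimated reuse probability by its compressed size, so a large, rarely reused block is evicted first. The reuse estimate is a hypothetical stand-in for the paper's actual mechanism.

    # Size-aware victim selection sketch (assumed form, not CAMP verbatim).
    # blocks: list of dicts with 'reuse_prob' (estimated, in [0, 1]) and
    # 'size' (compressed bytes, > 0).
    def victim(blocks):
        return min(blocks, key=lambda b: b['reuse_prob'] / b['size'])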
Caches often suffer from performance cliffs: minor changes in program behavior or available cache space cause large changes in miss rate. Cliffs hurt performance and complicate cache management. We present Talus, a simple scheme that removes these cliffs. Talus works by dividing a single application's access stream into two partitions, unlike prior work that partitions among competing applications...
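The interpolation this enables can be sketched as follows (my reading of the abstract, under simplified assumptions): given two points (s1, m1) and (s2, m2) on the miss curve bracketing the target size s, splitting the access stream between two partitions sized rho*s1 and (1-rho)*s2 yields the linear blend of the two miss rates, flattening any cliff between them.

    # Convex-combination sketch behind cliff removal (illustrative).
    def talus_split(s, s1, m1, s2, m2):
        # s1 < s < s2; (s1, m1) and (s2, m2) lie on the application's miss curve
        rho = (s2 - s) / (s2 - s1)        # fraction of accesses sent to partition 1
        part1_size = rho * s1
        part2_size = (1 - rho) * s2       # sizes sum to s by construction
        expected_miss = rho * m1 + (1 - rho) * m2
        return rho, part1_size, part2_size, expected_miss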
The massively parallel architecture enables graphics processing units (GPUs) to boost performance for a wide range of applications. Initially, GPUs employed only scratchpad memory as on-chip memory. Recently, to broaden the scope of applications that can be accelerated by GPUs, GPU vendors have used caches in conjunction with scratchpad memory as on-chip memory in the new generations of GPUs. Unfortunately,...
GPUs employ massive multithreading and fast context switching to provide high throughput and hide memory latency. Multithreading can increase contention for various system resources, however, which may result in suboptimal utilization of shared resources. Previous research has proposed variants of throttling thread-level parallelism to reduce cache contention and improve performance. Throttling approaches...
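A hypothetical feedback loop of the kind such throttling approaches use: sample a contention signal each epoch (here an L1 miss rate, with illustrative thresholds) and lower or raise the number of schedulable warps accordingly.

    # Illustrative TLP-throttling step; thresholds are arbitrary assumptions.
    def adjust_active_warps(active, max_warps, miss_rate, hi=0.6, lo=0.3):
        if miss_rate > hi and active > 1:
            return active - 1          # contention: throttle parallelism down
        if miss_rate < lo and active < max_warps:
            return active + 1          # headroom: restore parallelism
        return active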
Growing computer system sizes and levels of integration have made memory reliability a primary concern, necessitating strong memory error protection. As such, large-scale systems typically employ error checking and correcting codes to trade redundant storage and bandwidth for increased reliability. While stronger memory protection will be needed to meet reliability targets in the future, it is undesirable...
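To make the redundancy-for-reliability trade concrete, here is a toy Hamming(7,4) code: three check bits per four data bits buy single-bit error correction. Production memory ECC (e.g., SECDED over 64-bit words) applies the same principle at lower relative overhead.

    # Toy Hamming(7,4) encoder/decoder (illustrative, not a memory-grade code).
    def encode(d):                      # d: list of 4 data bits
        p1 = d[0] ^ d[1] ^ d[3]
        p2 = d[0] ^ d[2] ^ d[3]
        p3 = d[1] ^ d[2] ^ d[3]
        return [p1, p2, d[0], p3, d[1], d[2], d[3]]   # codeword positions 1..7

    def decode(c):                      # c: 7-bit codeword; corrects 1 flipped bit
        s = ((c[0] ^ c[2] ^ c[4] ^ c[6])
             | (c[1] ^ c[2] ^ c[5] ^ c[6]) << 1
             | (c[3] ^ c[4] ^ c[5] ^ c[6]) << 2)
        if s:
            c[s - 1] ^= 1               # syndrome names the erroneous position
        return [c[2], c[4], c[5], c[6]]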
Efficiently allocating shared on-chip resources across cores is critical to optimize execution in chip multiprocessors (CMPs). Techniques proposed in the literature often rely on global, centralized mechanisms that seek to maximize system throughput. Global optimization may hurt scalability: as more cores are integrated on a die, the search space grows exponentially, making it harder to achieve optimal...
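The growth of the search space is easy to make concrete: even for the single sub-problem of dividing W cache ways among N cores, the number of possible allocations is C(W + N - 1, N - 1). A quick check for a 32-way cache:

    # How many ways to split 32 cache ways among N cores (stars and bars).
    from math import comb

    for cores in (4, 8, 16, 32):
        print(cores, comb(32 + cores - 1, cores - 1))
    # 4 cores -> 6,545 allocations; 32 cores -> roughly 9e17

And this is before frequency, bandwidth, and power settings multiply the space further.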
Die-stacked DRAM is a technology that will soon be integrated in high-performance systems. Recent studies have focused on hardware caching techniques to make use of the stacked memory, but these approaches require complex changes to the processor and also cannot leverage the stacked memory to increase the system's overall memory capacity. In this work, we explore the challenges of exposing the stacked...
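One software-managed alternative the abstract points toward can be sketched as an OS-level hot-page policy (hypothetical, not necessarily the paper's design): periodically promote the most-accessed pages into the stacked DRAM, which then counts toward total capacity, and demote cold ones.

    # Epoch-based hot-page placement sketch for software-visible stacked DRAM.
    def pick_pages_to_promote(access_counts, stacked_capacity_pages, resident):
        # access_counts: page -> accesses this epoch; resident: pages in stacked DRAM
        hottest = sorted(access_counts, key=access_counts.get, reverse=True)
        target = set(hottest[:stacked_capacity_pages])
        promote = target - resident          # migrate into fast on-package memory
        demote = resident - target           # evict to off-package DRAM
        return promote, demote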
Mobile Web applications have become an integral part of our society. They pose a high demand for application quality of service (QoS). However, the energy-constrained nature of mobile devices makes optimizing for QoS difficult. Prior art on energy efficiency optimizations has only focused on the trade-off between raw performance and energy consumption, ignoring the application QoS characteristics...
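A QoS-aware governor, in contrast to a raw-performance one, can be sketched as picking the lowest frequency whose predicted latency still meets a user-perceived deadline; the latency predictor and deadline below are stand-ins, not the paper's models.

    # QoS-aware frequency selection sketch (assumed form).
    def pick_frequency(freqs_mhz, deadline_ms, predict_latency_ms):
        # freqs_mhz sorted ascending; predict_latency_ms(f) estimates load time at f
        for f in freqs_mhz:
            if predict_latency_ms(f) <= deadline_ms:
                return f              # slowest setting that still meets QoS
        return freqs_mhz[-1]          # otherwise run flat out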
Energy management in handheld devices is becoming a daunting task with the growing number of accelerators, increasing memory demands and high computing capacities required to support applications with stringent QoS needs. Current DVFS techniques that modulate power states of a single hardware component, or even recent proposals that manage multiple components, can miss opportunities for attaining...
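A toy illustration of why coordinated control can win: search the joint frequency space of several components for the cheapest configuration that meets the QoS target, rather than tuning each component in isolation. The power and performance models here are caller-supplied stand-ins.

    # Brute-force coordinated DVFS sketch over multiple components.
    from itertools import product

    def best_joint_setting(levels, power, perf, qos_target):
        # levels: {'cpu': [...], 'gpu': [...], 'mem': [...]} of frequency states
        # power(cfg), perf(cfg): models evaluated on a joint configuration tuple
        feasible = [cfg for cfg in product(*levels.values()) if perf(cfg) >= qos_target]
        return min(feasible, key=power) if feasible else None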
Energy efficiency is undoubtedly important for GPU architectures. Besides the traditionally explored energy-efficiency optimization techniques, exploiting the supply voltage guard-band remains a promising yet unexplored opportunity. Our hardware measurements show that up to 23% of the nominal supply voltage can be eliminated to improve GPU energy efficiency by as much as 25%. The key obstacle for...
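A back-of-envelope calculation shows why the guard-band is attractive: dynamic energy scales roughly with the square of supply voltage, so a 23% voltage reduction would ideally cut dynamic energy by about 41%; realized savings (the 25% above) are naturally smaller once static power and the cost of operating safely near the margin are accounted for.

    # Ideal dynamic-energy saving from a 23% voltage reduction (E ~ V^2).
    v_reduction = 0.23
    print(1 - (1 - v_reduction) ** 2)   # ~0.41, i.e. ~41% in the ideal case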
With the prevalence of GPUs as throughput engines for data parallel workloads, the landscape of GPU computing is changing significantly. Non-graphics workloads with high memory intensity and irregular access patterns are frequently targeted for acceleration on GPUs. While GPUs provide large numbers of compute resources, the resources needed for memory intensive workloads are more scarce. Therefore,...
Hierarchical clustered cache designs are becoming an appealing alternative for multicores. Grouping cores and their caches in clusters reduces network congestion by localizing traffic among several hierarchical levels, potentially enabling much higher scalability. While such architectures can be formed recursively by replicating a base design pattern, keeping the whole hierarchy coherent requires more...