2012 IEEE 18th International Symposium on High Performance Computer Architecture (HPCA)

chapter

Cache restoration for highly partitioned virtualized systems

David Daly, Harold W. Cain

IEEE International Symposium on High-Performance Comp Architecture > 1 - 10

The economics of server consolidation have led to the support of virtualization features in almost all server-class systems, with the related feature set being a subject of significant competition. While most systems allow for partitioning at the relatively coarse grain of a single core, some systems also support multiprogrammed virtualization, whereby a system can be more finely partitioned through...

chapter

Power balanced pipelines

John Sartori, Ben Ahrens, Rakesh Kumar

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

Since the onset of pipelined processors, balancing the delay of the microarchitectural pipeline stages such that each microarchitectural pipeline stage has an equal delay has been a primary design objective, as it maximizes instruction throughput. Unfortunately, this causes significant energy inefficiency in processors, as each microarchitectural pipeline stage gets the same amount of time to complete,...

chapter

Flexible register management using reference counting

Steven Battle, Andrew D. Hilton, Mark Hempstead, Amir Roth

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

Conventional out-of-order processors that use a unified physical register file allocate and reclaim registers explicitly using a free list that operates as a circular queue. We describe and evaluate a more flexible register management scheme — reference counting. We implement reference counting using a bit-matrix with a column for every physical register and a row for every entity that can hold a...

chapter

Decoupled dynamic cache segmentation

Samira M. Khan, Zhe Wang, Daniel A. Jimenez

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

The least recently used (LRU) replacement policy performs poorly in the last-level cache (LLC) because temporal locality of memory accesses is filtered by first and second level caches. We propose a cache segmentation technique that dynamically adapts to cache access patterns by predicting the best number of not-yet-referenced and already-referenced blocks in the cache. This technique is independent...

chapter

Network congestion avoidance through Speculative Reservation

Nan Jiang, Daniel U. Becker, George Michelogiannakis, William J. Dally

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

Congestion caused by hot-spot traffic can significantly degrade the performance of a computer network. In this study, we present the Speculative Reservation Protocol (SRP), a new network congestion control mechanism that relieves the effect of hot-spot traffic in high bandwidth, low latency, lossless computer networks. Compared to existing congestion control approaches like Explicit Congestion Notification...

chapter

Accelerating business analytics applications

Valentina Salapura, Tejas Karkhanis, Priya Nagpurkar, Jose Moreira

IEEE International Symposium on High-Performance Comp Architecture > 1 - 10

Business text analytics applications have seen rapid growth, driven by the mining of data for various decision making processes. Regular expression processing is an important component of these applications, consuming as much as 50% of their total execution time. While prior work on accelerating regular expression processing has focused on Network Intrusion Detection Systems, business analytics applications...

chapter

Architectural support for synchronization-free deterministic parallel programming

Cedomir Segulja, Tarek S. Abdelrahman

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

We propose a novel synchronization mechanism called versioning. It dynamically establishes a deterministic order of memory accesses in parallel programs that have serial semantics, in a way that is transparent to the programmer. This order is created in a distributed manner and is enforced by monitoring memory accesses and stalling threads if necessary. Versioning gives rise to parallel programming...

chapter

QuickIA: Exploring heterogeneous architectures on real prototypes

Nagabhushan Chitlur, Ganapati Srinivasa, Scott Hahn, P K Gupta, et al.

IEEE International Symposium on High-Performance Comp Architecture > 1 - 8

Over the last decade, homogeneous multi-core processors emerged and became the de facto approach for offering high parallelism, high performance and scalability for a wide range of platforms. We are now at an interesting juncture where several critical factors (smaller form factor devices, power challenges, need for specialization, etc.) are guiding architects to consider heterogeneous chips and platforms...

chapter

Architectural perspectives of future wireless base stations based on the IBM PowerEN™ processor

Augusto Vega, Pradip Bose, Alper Buyuktosunoglu, Jeff Derby, et al.

IEEE International Symposium on High-Performance Comp Architecture > 1 - 10

In wireless networks, base stations are responsible for processing large amounts of traffic at high speeds. With the advent of new standards such as 4G, further pressure is placed on the hardware to satisfy speeds of up to 1 Gbps. In this work, we study the applicability and potential benefits of the IBM PowerEN processor (a multi-core, massively multithreaded platform) in the realm of...

chapter

Statistical performance comparisons of computers

Tianshi Chen, Yunji Chen, Qi Guo, Olivier Temam, et al.

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

As a fundamental task in computer architecture research, performance comparison has been continuously hampered by the variability of computer performance. In traditional performance comparisons, the impact of performance variability is usually ignored (i.e., the means of performance measurements are compared regardless of the variability), or in the few cases where it is factored in using parametric...

chapter

MACAU: A Markov model for reliability evaluations of caches under Single-bit and Multi-bit Upsets

Jinho Suh, Murali Annavaram, Michel Dubois

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

Due to the growing trend that a Single Event Upset (SEU) can cause spatial Multi-Bit Upsets (MBUs), the effects of spatial MBUs have recently become an important yet very challenging issue, especially in large last-level caches (LLCs) protected by protection codes. In the presence of spatial MBUs, the strength of the protection codes becomes a critical design issue. Developing a reliability model...

chapter

Improving write operations in MLC phase change memory

Lei Jiang, Bo Zhao, Youtao Zhang, Jun Yang, et al.

IEEE International Symposium on High-Performance Comp Architecture > 1 - 10

Phase change memory (PCM) has recently emerged as a promising technology to meet the fast-growing demand for large-capacity memory in modern computer systems. In particular, multi-level cell (MLC) PCM, which stores multiple bits in a single cell, offers high density with low per-byte fabrication cost. However, despite many advantages, such as good scalability and low leakage, PCM suffers from exceptionally...

chapter

System-level implications of disaggregated memory

Kevin Lim, Yoshio Turner, Jose Renato Santos, Alvin AuYoung, et al.

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

Recent research on memory disaggregation introduces a new architectural building block — the memory blade — as a cost-effective approach for memory capacity expansion and sharing for an ensemble of blade servers. Memory blades augment blade servers' local memory capacity with a second-level (remote) memory that can be dynamically apportioned among blades in response to changing capacity demand, albeit...

chapter

TAP: A TLP-aware cache management policy for a CPU-GPU heterogeneous architecture

Jaekyu Lee, Hyesoon Kim

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

Combining CPUs and GPUs on the same chip has become a popular architectural trend. However, these heterogeneous architectures put more pressure on shared resource management. In particular, managing the last-level cache (LLC) is very critical to performance. Lately, many researchers have proposed several shared cache management mechanisms, including dynamic cache partitioning and promotion-based cache...

chapter

Quasi-nonvolatile SSD: Trading flash memory nonvolatility to improve storage system performance for enterprise applications

Yangyang Pan, Guiqiang Dong, Qi Wu, Tong Zhang

IEEE International Symposium on High-Performance Comp Architecture > 1 - 10

This paper advocates a quasi-nonvolatile solid-state drive (SSD) design strategy for enterprise applications. The basic idea is to trade data retention time of NAND flash memory for other system performance metrics including program/erase (P/E) cycling endurance and memory programming speed, and meanwhile use explicit internal data refresh to accommodate very short data retention time (e.g., few weeks...

chapter

SCD: A scalable coherence directory with flexible sharer set encoding

Daniel Sanchez, Christos Kozyrakis

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

Large-scale CMPs with hundreds of cores require a directory-based protocol to maintain cache coherence. However, previously proposed coherence directories are hard to scale beyond tens of cores, requiring either excessive area or energy, complex hierarchical protocols, or inexact representations of sharer sets that increase coherence traffic and degrade performance. We present SCD, a scalable coherence...

chapter

Dynamically heterogeneous cores through 3D resource pooling

Houman Homayoun, Vasileios Kontorinis, Amirali Shayan, Ta-Wei Lin, et al.

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

This paper describes a dynamically heterogeneous processor architecture leveraging 3D stacking technology. Unlike prior work in the 2D plane, the extra dimension makes it possible to share resources at a fine granularity between vertically stacked cores. As a result, each core can grow or shrink resources, as needed by the code running on the core. This architecture, therefore,...

chapter

JETC: Joint energy thermal and cooling management for memory and CPU subsystems in servers

Raid Ayoub, Rajib Nath, Tajana Rosing

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

In this work we propose a joint energy, thermal and cooling management technique (JETC) that significantly reduces per server cooling and memory energy costs. Our analysis shows that decoupling the optimization of cooling energy of CPU & memory and the optimization of memory energy leads to suboptimal solutions due to thermal dependencies between CPU and memory and non-linearity in cooling energy...

chapter

Cooperative partitioning: Energy-efficient cache partitioning for high-performance CMPs

Karthik T. Sundararajan, Vasileios Porpodas, Timothy M. Jones, Nigel P. Topham, et al.

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

Intelligently partitioning the last-level cache within a chip multiprocessor can bring significant performance improvements. Resources are given to the applications that can benefit most from them, restricting each core to a number of logical cache ways. However, although overall performance is increased, existing schemes fail to consider energy saving when making their partitioning decisions. This...

chapter

Computational sprinting

Arun Raghavan, Yixin Luo, Anuj Chandawalla, Marios Papaefthymiou, et al.

IEEE International Symposium on High-Performance Comp Architecture > 1 - 12

Although transistor density continues to increase, voltage scaling has stalled and thus power density is increasing each technology generation. Particularly in mobile devices, which have limited cooling options, these trends lead to a utilization wall in which sustained chip performance is limited primarily by power rather than area. However, many mobile applications do not demand sustained performance;...
