2016 IEEE 34th International Conference on Computer Design (ICCD)

chapter

Process variations-aware resistive associative processor design

Hasan Erdem Yantir, Mohammed E. Fouda, Ahmed M. Eltawil, Fadi J. Kurdahi

2016 IEEE 34th International Conference on Computer Design (ICCD) > 49 - 55

Recent breakthroughs in memristive devices have demonstrated the potential of using resistive content addressable memories for associative processing. These architectures enable ultra-high density integrated circuits along with low-power computation. However, the reliability of memristive elements is limiting the widespread adoption of these architectures. In this study, we address the reliability...

chapter

Extending On-chip Interconnects for rack-level remote resource access

Yisong Chang, Ke Zhang, Sally A. McKee, Lixin Zhang, et al.

2016 IEEE 34th International Conference on Computer Design (ICCD) > 56 - 63

The need to perform data analytics on exploding data volumes coupled with the rapidly changing workloads in cloud computing places great pressure on data-center servers. To improve hardware resource utilization across servers within a rack, we propose Direct Extension of On-chip Interconnects (DEOI), a high-performance and efficient architecture for remote resource access among server nodes. DEOI...

chapter

A strong arbiter PUF using resistive RAM within 1T-1R memory architecture

Rekha Govindaraj, Swaroop Ghosh

2016 IEEE 34th International Conference on Computer Design (ICCD) > 141 - 148

Physically Unclonable Functions (PUFs) are cost-effective and reliable security primitives widely used in authentication and in-place secret key generation. With growing research in the area of non-CMOS technologies for memories and circuits, it is important to understand their implications for the design of security primitives. Resistive Random Access Memory (RRAM) offers easy integration with CMOS...

chapter

Power-aware virtual machine mapping in the data-center-on-a-chip paradigm

Xue Lin, Yuankun Xue, Paul Bogdan, Yanzhi Wang, et al.

2016 IEEE 34th International Conference on Computer Design (ICCD) > 241 - 248

It is projected that hundreds of cores can be integrated into a chip at the sub-20nm technology nodes. However, some challenges exist in the many-core architecture such as maintaining memory coherence, underutilized parallelism, and increased inter-core communication delay. This work proposes the data-center-on-a-chip (DCoC) paradigm employing virtualization technologies commonly used in today's data...

chapter

Tuning Stencil codes in OpenCL for FPGAs

Qi Jia, Huiyang Zhou

2016 IEEE 34th International Conference on Computer Design (ICCD) > 249 - 256

OpenCL is designed as a parallel programming framework to support heterogeneous computing platforms. The implicit or explicit parallelism in OpenCL kernel code enables efficient FPGA implementation from a high-level programming abstraction. However, FPGA architecture is completely different from GPU architecture, for which OpenCL is widely used. Tuning OpenCL codes to achieve high performance on FPGAs...

chapter

Tolerating more hard errors in MLC PCMs using compression

Majid Jalili, Hamid Sarbazi-Azad

2016 IEEE 34th International Conference on Computer Design (ICCD) > 304 - 311

Modern computer systems require fast, large and reliable memories to handle the information explosion. With this goal in mind, not only is the deployment of main memories with new technologies necessary, but adopting innovative solutions to newfound challenges must also be considered a priority. Recently, phase change memory (PCM) appeared as a preferred candidate for substituting DRAM. PCM...

chapter

Efficient processor allocation in a reconfigurable CMP architecture for dark silicon era

Fatemeh Aghaaliakbari, Mohaddeseh Hoveida, Mohammad Arjomand, Majid Jalili, et al.

2016 IEEE 34th International Conference on Computer Design (ICCD) > 336 - 343

The continuance of Moore's law and failure of Dennard scaling force future chip multiprocessors (CMPs) to have considerable dark regions. How to use up available dark resources is an important concern for computer architects. In harmony with these changes, we must revise processor allocation schemes that severely affect the performance of a parallel on-chip system. A suitable allocation algorithm...

chapter

IACM: Integrated adaptive cache management for high-performance and energy-efficient GPGPU computing

Kyu Yeun Kim, Jinsu Park, Woongki Baek

2016 IEEE 34th International Conference on Computer Design (ICCD) > 380 - 383

Hardware caches are widely employed in GPGPUs to achieve higher performance and energy efficiency. Incorporating hardware caches in GPGPUs, however, does not immediately guarantee enhanced performance and energy efficiency due to high cache contention and thrashing. To address the inefficiency of GPGPU caches, various adaptive techniques (e.g., warp limiting) have been proposed. However, relatively...

chapter

Quantifying the difference in resource demand among classic and modern NoC workloads

Amirhossein Mirhosseini, Mohammad Sadrosadati, Maryam Zare, Hamid Sarbazi-Azad

2016 IEEE 34th International Conference on Computer Design (ICCD) > 404 - 407

This paper quantifies the difference in resource demand between modern and classic NoC workloads. In the paper, we show that modern workloads are able to better utilize higher numbers of VCs and smaller C factors in order to attain performance and energy efficiency. This is because of the high throughput and possible local congestion in their traffic pattern. As a result, such workloads are more...

chapter

Parallelizing Latent Semantic Indexing using an FPGA-based architecture

Xinying Wang, Joseph Zambreno

2016 IEEE 34th International Conference on Computer Design (ICCD) > 432 - 435

Latent Semantic Indexing (LSI) has played a significant role in discovering patterns in the relationships between query terms and unstructured documents. However, the inherent characteristics of complex matrix factorization in LSI make it difficult to meet stringent performance requirements. In this paper, we present a deeply pipelined reconfigurable architecture for LSI, which parallelizes the matrix...

chapter

A heterogeneous low-cost and low-latency Ring-Chain network for GPGPUs

Xia Zhao, Sheng Ma, Chen Li, Lieven Eeckhout, et al.

2016 IEEE 34th International Conference on Computer Design (ICCD) > 472 - 479

To achieve high throughput, core count in compute accelerators such as General-Purpose Graphics Processing Units (GPGPUs) increases continuously. The communication demand of these cores boosts the demand for a low-latency packet switched network. As packet latency is mainly composed of per-hop latency, contention latency and serialization latency, a favorable Network-on-Chip (NoC) design should efficiently...

chapter

Generating efficient and high-quality pseudo-random behavior on Automata Processors

Jack Wadden, Nathan Brunelle, Ke Wang, Mohamed El-Hadedy, et al.

2016 IEEE 34th International Conference on Computer Design (ICCD) > 622 - 629

Micron's Automata Processor (AP) efficiently emulates non-deterministic finite automata and has been shown to provide large speedups over traditional von Neumann execution for massively parallel, rule-based, data-mining and pattern matching applications. We demonstrate the AP's ability to generate high-quality and energy efficient pseudo-random behavior for use in pseudo-random number generation or...

chapter

x86 computer architecture simulators: A comparative study

Ayaz Akram, Lina Sawalha

2016 IEEE 34th International Conference on Computer Design (ICCD) > 638 - 645

The significance of computer architecture simulators in advancing computer architecture research is widely acknowledged. Computer architects have developed numerous simulators in the past few decades and their number continues to rise. This paper explores different simulation techniques and surveys many x86 simulators. Comparing simulators with each other and validating their correctness has been...

chapter

DLL: A dynamic latency-aware load-balancing strategy in 2.5D NoC architecture

Chen Li, Sheng Ma, Lu Wang, Zicong Wang, et al.

2016 IEEE 34th International Conference on Computer Design (ICCD) > 646 - 653

Because 3D stacking technology still faces several challenges, 2.5D stacking technology currently has better application prospects. With a silicon interposer, 2.5D stacking can improve the bandwidth and capacity of the memory system. To satisfy the communication requirements of the integrated memory system, the free routing resources in the interposer should be explored to implement an additional...
