Cloud computing is gaining popularity due to its ability to provide infrastructure, platform and software services to clients on a global scale. Using cloud services, clients reduce the cost and complexity of buying and managing the underlying hardware and software layers. Popular services like web search, data analytics and data mining typically work with big data sets that do not fit into top level...
The effectiveness of traditional data prefetching techniques decreases when dealing with the complex structures of pointer-chasing applications. To address this problem, an adaptive pointer-chasing data prefetching strategy is proposed, based on runtime phased memory behavior. This strategy aims to develop an effective data prefetching scheduling mechanism in the light...
Interpreters have been used in many contexts. They provide portability and ease of development at the expense of performance. The literature of the past decade covers analysis of why interpreters are slow, and many software techniques to improve them. A large proportion of these works focuses on the dispatch loop, and in particular on the implementation of the switch statement: typically an indirect...
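The abstract above refers to the classic switch-based dispatch loop. As a minimal sketch (an illustrative example, not code from the cited work), the loop below fetches an opcode and branches to its handler each iteration; in C, the equivalent switch compiles to a single indirect branch, which is the structure most of the cited optimization work targets.

```python
# Minimal stack-machine interpreter with a switch-style dispatch loop.
# The if/elif chain plays the role of the C switch statement's cases.
def run(bytecode):
    stack = []
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]          # fetch the next opcode
        pc += 1
        if op == "PUSH":           # dispatch: one "case" per opcode
            stack.append(bytecode[pc])
            pc += 1
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "HALT":
            break
    return stack[-1]
```

For example, `run(["PUSH", 2, "PUSH", 3, "ADD", "HALT"])` evaluates 2 + 3 through the dispatch loop.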
Memory prefetchers predict streams of memory addresses that are likely to be accessed by recurring invocations of a static instruction. They identify an access pattern and prefetch the data that is expected to be accessed by pending invocations of the said instruction. A stream, or a prefetch context, is thus typically composed of a trigger instruction and an access pattern. Recurring code blocks,...
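A common concrete instance of the "trigger instruction plus access pattern" idea above is a per-PC stride prefetcher. The sketch below is a hedged illustration of that general mechanism; the table layout, confidence threshold, and prefetch degree are illustrative assumptions, not details from the cited paper.

```python
# Per-PC stride prefetcher: each static instruction (identified by its
# program counter) gets one table entry recording its last address,
# detected stride, and a small confidence counter.
class StridePrefetcher:
    def __init__(self, degree=2):
        self.table = {}        # pc -> (last_addr, stride, confidence)
        self.degree = degree   # how many future addresses to prefetch

    def access(self, pc, addr):
        prefetches = []
        if pc in self.table:
            last, stride, conf = self.table[pc]
            new_stride = addr - last
            if new_stride == stride:
                conf = min(conf + 1, 3)   # pattern repeats: gain confidence
            else:
                stride, conf = new_stride, 0
            if conf >= 2:                 # pattern confirmed: issue prefetches
                prefetches = [addr + stride * i
                              for i in range(1, self.degree + 1)]
            self.table[pc] = (addr, stride, conf)
        else:
            self.table[pc] = (addr, 0, 0)
        return prefetches
```

A load at PC 0x10 touching addresses 0, 64, 128, 192 trains a 64-byte stride; once confidence is established, the prefetcher returns the next addresses in the stream (256, 320).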
Signatures have been proposed in transactional memory systems to represent read and write sets and to decouple transaction conflict detection from private caches or to accelerate it. Generally, signatures are implemented as Bloom filters that allow unbounded read/write sets to be summarized in bounded space at the cost of false conflict detection. It is known that this behavior has a great impact on...
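The trade-off the abstract describes can be seen in a small sketch. Below is a hedged, simplified Bloom-filter signature (the hash functions here are Python's built-in `hash`; real hardware designs use fixed hash circuits such as H3, and sizes differ): membership tests never miss an inserted address, but may report addresses that were never inserted, which is exactly the source of false conflicts.

```python
# Bloom-filter signature summarizing a transaction's read/write set
# in a fixed number of bits. `might_contain` can return True for an
# address that was never inserted (a false positive -> false conflict),
# but never returns False for an inserted address.
class Signature:
    def __init__(self, bits=64, k=2):
        self.bits, self.k = bits, k
        self.filter = 0                      # bit vector, stored as an int

    def _hashes(self, addr):
        for i in range(self.k):
            yield hash((addr, i)) % self.bits

    def insert(self, addr):
        for h in self._hashes(addr):
            self.filter |= 1 << h

    def might_contain(self, addr):
        return all(self.filter >> h & 1 for h in self._hashes(addr))
```

Conflict detection between two transactions then reduces to testing each address one transaction accesses against the other's signature, in bounded space regardless of how large the real read/write sets grow.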
We propose a new, low-cost, hardware-only scheme to detect errors in superscalar, out-of-order processor cores. For each instruction decoded, Nostradamus compares what the instruction is expected to do against what the instruction actually does. We implement Nostradamus in RTL on top of a baseline superscalar, out-of-order core, and we experimentally evaluate its ability to detect injected errors...
Accurate branch prediction can improve processor performance while reducing energy waste. Though some existing branch predictors have proved effective, they usually require a large amount of storage or complicate the processor front-end. This paper proposes a novel branch prediction technique called History Artificially Selected (HAS) prediction. It is a hardware technique that builds on the existing...
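The abstract gives few details of HAS itself. As background, here is a minimal sketch of a gshare-style global-history predictor, the kind of existing history-based predictor such techniques typically build on; the table size and 2-bit counters are illustrative, not HAS's actual design.

```python
# gshare-style predictor: the branch PC is XORed with a global history
# register (GHR) to index a table of 2-bit saturating counters.
class Gshare:
    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.ghr = 0                           # global branch history
        self.pht = [2] * (1 << index_bits)     # counters, init weakly taken

    def predict(self, pc):
        return self.pht[(pc ^ self.ghr) & self.mask] >= 2

    def update(self, pc, taken):
        i = (pc ^ self.ghr) & self.mask
        self.pht[i] = min(self.pht[i] + 1, 3) if taken else max(self.pht[i] - 1, 0)
        self.ghr = ((self.ghr << 1) | int(taken)) & self.mask
```

After a few not-taken outcomes for one branch, its counter saturates low and the predictor switches to predicting not-taken for that PC/history pair.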
Modern superscalar processors squash all wrong-path instructions when a branch prediction misses. In deeper pipelines, the branch misprediction penalty increases seriously owing to the large number of squashed instructions. Exploiting control independence has been proposed to reduce this penalty. Control independence methods reuse control-independent instructions (CI instructions) without squashing...
Transactional Memory (TM) has attracted considerable attention because it promises to increase programmer productivity by making it easier to write correct parallel programs. To maintain correctness in the face of concurrency, detecting conflicts among simultaneously running transactions is an essential element. Hardware signatures have been proposed as an area-efficient mechanism for conflict detection...
In an effort to achieve the high prediction accuracy needed to attain high instruction throughputs, branch predictors proposed in the literature and used in real systems have become increasingly more complex and larger over time. This is not consistent with the anticipated trend of simpler and more numerous cores in future multi-core processors. We introduce the Spotlight branch predictor, a novel...
We propose a low-area, high-performance cache replacement policy for embedded processors called Hierarchical Non-Most-Recently-Used (H-NMRU). The H-NMRU is a parameterizable policy that allows performance to be traded off against area. We extended the Dinero cache simulator with the H-NMRU policy and performed architectural exploration with a set of cellular and multimedia benchmarks. On a 16-way cache, a...
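The abstract does not spell out the H-NMRU structure, but the name suggests organizing a set's ways as a tree in which each node tracks its most-recently-used child and a victim is found by descending through non-MRU children. The sketch below is an assumed two-level illustration of that idea (tree shape, arity, and random victim choice are guesses, not the paper's exact design).

```python
import random

# Assumed two-level hierarchical NMRU for a 16-way set: 4 groups of 4
# ways. One pointer tracks the MRU group; each group tracks its MRU way.
# A victim is any way that is not MRU at either level, so the state is
# far smaller than full LRU ordering.
class HNMRU:
    def __init__(self, groups=4, ways_per_group=4):
        self.groups, self.wpg = groups, ways_per_group
        self.mru_group = 0
        self.mru_way = [0] * groups       # MRU way within each group

    def touch(self, way):                 # record a hit/fill on `way`
        g, w = divmod(way, self.wpg)
        self.mru_group, self.mru_way[g] = g, w

    def victim(self):                     # pick a way not MRU at any level
        g = random.choice([g for g in range(self.groups)
                           if g != self.mru_group])
        w = random.choice([w for w in range(self.wpg)
                           if w != self.mru_way[g]])
        return g * self.wpg + w
```

With 4 groups of 4 ways this needs only a 2-bit MRU-group pointer plus four 2-bit per-group pointers, versus the much larger state of true 16-way LRU, which is the performance/area trade-off the abstract mentions.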
Embedded systems are developing rapidly, their functions are becoming more complicated, and multimedia applications are growing daily and consuming more electrical power. Therefore, how to improve stand-by time has become a very important issue. Related research indicates that the processor cache accounts for a large proportion of power consumption. Way-prediction and LRU (least recently used) algorithms improve...
Memory latency is a significant bottleneck in modern computer architectures, especially for commercial and multimedia applications. With the advent of superscalar processors and multicore systems, instruction cache misses can severely limit performance. Prefetching is one of the most promising methods for bridging the performance gap between CPU and DRAM speeds. Although instruction prefetching is a promising...