Search results

Items from 81 to 100 out of 567 results

chapter

ReMAP: A Reconfigurable Heterogeneous Multicore Architecture

M A Watkins, D H Albonesi

2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture > 497 - 508

2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2010)

This paper presents ReMAP, a reconfigurable architecture geared towards accelerating and parallelizing applications within a heterogeneous CMP. In ReMAP, threads share a common reconfigurable fabric that can be configured for individual thread computation or fine-grained communication with integrated computation. The architecture supports both fine-grained point-to-point communication for pipeline...

chapter

Throughput-Effective On-Chip Networks for Manycore Accelerators

A Bakhoda, J Kim, T M Aamodt

2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture > 421 - 432

2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2010)

As the number of cores and threads in manycore compute accelerators such as Graphics Processing Units (GPU) increases, so does the importance of on-chip interconnection network design. This paper explores throughput-effective network-on-chips (NoC) for future manycore accelerators that employ bulk-synchronous parallel (BSP) programming models such as CUDA and OpenCL. A hardware optimization is "throughput-effective"...

chapter

Flexible and Efficient Instruction-Grained Run-Time Monitoring Using On-Chip Reconfigurable Fabric

D Y Deng, D Lo, G Malysa, S Schneider, et al.

2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture > 137 - 148

2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2010)

This paper proposes Flex Core, a hybrid processor architecture where an on-chip reconfigurable fabric (FPGA) is tightly coupled with the main processing core. Flex Core provides an efficient platform that can support a broad range of run-time monitoring and bookkeeping techniques. Unlike using custom hardware, which is more efficient but often extremely difficult and expensive to incorporate into...

chapter

Open Source Precision Timed Soft Processor for Cyber Physical System Applications

S Craven, D Long, J Smith

2010 International Conference on Reconfigurable Computing and FPGAs > 448 - 451

2010 International Conference on Reconfigurable Computing and FPGAs (ReConFig 2010)

Modern processor architectures sacrifice timing predictability to improve average performance. Branch prediction, out-of-order execution, and multi-level cache hierarchies complicate accurate execution time estimates. The timing demands of Cyber Physical Systems (CPS) have led some to propose new processor architectures, including Precision Timed (PRET) processors, which simplify analysis of execution...

chapter

Operating System Structures for Multiprocessor Systems on Programmable Chip

Miaoqing Huang, David Andrews, Jason Agron

2010 International Conference on Reconfigurable Computing and FPGAs > 358 - 363

2010 International Conference on Reconfigurable Computing and FPGAs (ReConFig 2010)

Chips are moving from single-core systems to much more complex, heterogeneous many core systems. While heterogeneous architectures promise high performance, they are also challenging our ability to port our existing operating systems to abstract the heterogeneous components into a unified architecture. Baseline solutions to resolve heterogeneity issues within many cores use Remote Procedure Calls...

chapter

A Minimalistic Architecture for Reconfigurable WFS-Based Immersive-Audio

D Theodoropoulos, G Kuzmanov, G Gaydadjiev

2010 International Conference on Reconfigurable Computing and FPGAs > 1 - 6

2010 International Conference on Reconfigurable Computing and FPGAs (ReConFig 2010)

We propose a minimalistic processor architecture tailoring Wave Field Synthesis (WFS)-based audio applications to configurable hardware. Eleven high-level instructions provide the required flexibility for embedded WFS customization. We describe the implementation of the proposed instructions and apply them to a multi-core reconfigurable WFS architecture. Our approach combines software programming...

chapter

Characterization of Scientific and Transactional Applications under Multi-core Architectures on Cloud Computing Environment

D R Ogura, E T Midorikawa

2010 13th IEEE International Conference on Computational Science and Engineering > 314 - 320

2010 IEEE 13th International Conference on Computational Science and Engineering (CSE 2010)

Cloud Computing is one of the hottest topics researched today, with the objective of taking advantage of data center computational resources. Hardware and software virtualization make the environment scalable, redundant, and lower cost. This paper intends to characterize scientific and transactional applications in Cloud infrastructures IaaS, identifying the best virtual machine configuration in terms...

chapter

Memory-Aware Optimal Scheduling with Communication Overhead Minimization for Streaming Applications on Chip Multiprocessors

Yi Wang, Duo Liu, Zhiwei Qin, Zili Shao

2010 31st IEEE Real-Time Systems Symposium > 350 - 359

2010 IEEE 31st Real-Time Systems Symposium (RTSS 2010)

In this paper, we focus on solving the problem of removing inter-core communication overhead for streaming applications on chip multiprocessors. The objective is to totally remove inter-core communication overhead while minimizing the overall memory usage. By totally removing inter-core communication overhead, a shorter period can be applied and system throughput can be improved. Our basic idea is...

chapter

A dynamically programmable radio processing MPSoC with hardware-based task management

O Sarode, Z Miljanic, P Spasojevic

2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers > 1264 - 1268

2010 44th Asilomar Conference on Signals, Systems and Computers

We propose a programmable heterogeneous multi-processor system-on-chip (MPSoC) platform architecture for flexible radio processing that aims at striking a balance between performance (as provided by ASICs) and flexibility (as provided by SDR). Based on a novel hardware-oriented Virtual Flow Pipelining (VFP) framework, the key highlights of this solution are a simple task-level programming model for...

chapter

Advanced SystemBuilder: A tool set for multiprocessor design space exploration

Seiya Shibata, Shinya Honda, Hiroyuki Tomiyama, Hiroaki Takada

2010 International SoC Design Conference > 79 - 82

2010 International SoC Design Conference (ISOCC 2010)

This paper presents our integrated system-level design tool set, named Advanced SystemBuilder. Advanced SystemBuilder supports overall methodology for system design and design space exploration, and provides programming model of systems, automatic synthesis capabilities for FPGA-based prototyping, cosimulation and execution profiling. A case study of MPEG4 decoder design shows the effectiveness of...

chapter

Time-predictable chip-multiprocessor design

M Schoeberl

2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers > 2116 - 2120

2010 44th Asilomar Conference on Signals, Systems and Computers

Real-time systems need time-predictable platforms to enable static worst-case execution time (WCET) analysis. Improving the processor performance with superscalar techniques makes static WCET analysis practically impossible. However, most real-time systems are multi-threaded applications and performance can be improved by using several processor cores on a single chip. In this paper we present a time-predictable...

chapter

NoC-based CSP support for a Java chip multiprocessor

F Gruian, M Schoeberl

NORCHIP 2010 > 1 - 6

2010 28th Norchip Conference (NORCHIP 2010)

In this paper we examine the idea of implementing communicating sequential processes (CSP) constructs on a Java embedded chip multiprocessor (CMP). The approach is intended to reduce the memory bandwidth pressure on the shared memory, by employing a dedicated network-on-chip (NoC). The presented solution is scalable and also specific for our limited resources and real-time predictability requirements...

chapter

Multi-application multi-step mapping method for many-core Network-on-Chips

Bo Yang, Liang Guang, T C Xu, A W Yin, et al.

NORCHIP 2010 > 1 - 6

2010 28th Norchip Conference (NORCHIP 2010)

Massively parallel computing performed on many-core Network-on-Chips (NoCs) is the future of computing. One feasible approach to implement parallel computing is to deploy multiple applications on the NoC simultaneously. In this paper, we propose a multi-application mapping method starting with the application mapping which finds a region on the NoC for each application and then task mapping which...

chapter

A novel multi-core processor for communication applications

Ruijin Xiao, Heng Quan, Kaidi You, Bei Huang, et al.

2010 10th IEEE International Conference on Solid-State and Integrated Circuit Technology > 236 - 238

2010 10th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT)

This paper proposes a novel multi-core processor with a SIMD (Single Instruction Multiple Data) ISA (Instruction Set Architecture) and an extended register file for communication applications. To acquire better parallel computing capability, we implement a SIMD ISA and increase the number of register file from 32 to 64. A 5×5 homogeneous 2-D mesh NoC (Network-on-Chip) topology is adopted to further enhance...

chapter

The design and implementation of two-cycle NoC router

Qi Shubo, Jinwen Li, Tianlei Zhao, Xiaomin Jia, et al.

2010 10th IEEE International Conference on Solid-State and Integrated Circuit Technology > 233 - 235

2010 10th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT)

With the number of processor cores increasing in chip multi-processors (CMPs) and global wire delays increasing, networks on chip have been gaining wide acceptance for on-chip inter-core communication. This paper introduces a low latency Dynamic Virtual Output Queues Router (DVOQR), which can reduce the router latency to two cycles by leveraging look-ahead routing computation and virtual output address...

chapter

Parallel instruction set extension identification

Daniel Shapiro, Michael Montcalm, Miodrag Bolic

2010 IEEE 26-th Convention of Electrical and Electronics Engineers in Israel > 535 - 539

2010 IEEE 26th Convention of Electrical & Electronics Engineers in Israel (IEEEI 2010)

Modern embedded processors are often customized to accelerate native code. However, the design space exploration of hardware/software trade-offs is often time-intensive. To explore the design space of a processor's instruction set, simulations are utilized. Instruction set extension identification is usually performed by analyzing the basic blocks of an application in a linear fashion. We present...

chapter

Temperature aware power optimization for multicore floating-point units

Wei Liu, A Nannarelli

2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers > 1134 - 1138

2010 44th Asilomar Conference on Signals, Systems and Computers

Fused Multiply-Add (FMA) units are quite popular in floating-point execution units in state-of-the-art multicore processors. It has been shown that, for division operations, using digit-recurrence units consumes much less power and energy than using FMA units which are based on Newton-Raphson approximation algorithms. In this work, we show that digit-recurrence division units can also reduce on chip...

chapter

Accelerating I/O Forwarding in IBM Blue Gene/P Systems

V Vishwanath, M Hereld, Kamil Iskra, Dries Kimpe, et al.

2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis > 1 - 10

2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis

Current leadership-class machines suffer from a significant imbalance between their computational power and their I/O bandwidth. I/O forwarding is a paradigm that attempts to bridge the increasing performance and scalability gap between the compute and I/O components of leadership-class machines to meet the requirements of data-intensive applications by shipping I/O calls from compute nodes to dedicated...

chapter

Cloth Simulation Using AABB Hierarchies and GPU Parallelism

Frizzi San Roman Salazar, B B Machado, Alexander Ocsa, M C F de Oliveira

2010 Brazilian Symposium on Games and Digital Entertainment > 97 - 107

2010 Brazilian Symposium on Games and Digital Entertainment (SBGAMES 2010)

Providing realistic, high-resolution and high fidelity representation of motions is essential in the cloth simulation problem. In order to make high resolution simulations tractable, several algorithms have been developed that manage cloth-object interactions efficiently through specialized data structures such as AABB trees. However, implementation restrictions on single CPU architectures impose...

chapter

Title Page i

2010 First International Conference on Networking and Computing > i

2010 First International Conference on Networking and Computing (ICNC 2010)

The following topics are dealt with: peer-to-peer system; screen space ambient occlusion; cognitive radio network; computer architecture; virtual channel allocation; performance oriented system; FPGA; hardware design; many-core processor; dependable acceleration systems; photonic network-on-chip; parallel algorithm; distributed algorithm; CUDA GPU; Canny edge detection; interconnection network; and...

Data set:
ieee
Keywords:
COMPUTER ARCHITECTURE
MULTIPROCESSING SYSTEMS

Content availability

Available (555)
None (12)

Publication type

book (471)
article (96)

Keywords

HARDWARE (157)
PROGRAM PROCESSORS (122)
MICROPROCESSOR CHIPS (107)
SYSTEM-ON-CHIP (94)
PARALLEL PROCESSING (85)
FIELD PROGRAMMABLE GATE ARRAYS (83)
COMPUTATIONAL MODELING (80)
PARALLEL ARCHITECTURES (72)
EMBEDDED SYSTEMS (65)
MICROPROCESSORS (62)
MAGNETIC CORES (61)
REGISTERS (60)
SOFTWARE (58)
NETWORK-ON-CHIP (54)
PERFORMANCE EVALUATION (52)
INSTRUCTION SETS (50)
BENCHMARK TESTING (49)
MULTICORE PROCESSING (49)
MULTI-THREADING (48)
SYSTEM-ON-A-CHIP (48)
ALGORITHM DESIGN AND ANALYSIS (42)
BANDWIDTH (42)
PROCESSOR SCHEDULING (40)
FPGA (38)
KERNEL (36)
SCHEDULING (35)
CACHE STORAGE (34)
RECONFIGURABLE ARCHITECTURES (34)
OPTIMIZATION (33)
PARALLEL PROGRAMMING (33)
PROTOCOLS (33)
SYNCHRONIZATION (33)
CLOCKS (32)
DATA MINING (32)
REAL TIME SYSTEMS (32)
MPSOC (31)
SWITCHES (31)
COPROCESSORS (30)
LOGIC DESIGN (29)
PIPELINES (28)
PROGRAMMING (28)
DECODING (27)
DELAY (27)
INTEGRATED CIRCUIT DESIGN (27)
PIPELINE PROCESSING (27)
RESOURCE MANAGEMENT (27)
THROUGHPUT (27)
LIBRARIES (26)
RANDOM ACCESS MEMORY (26)
RESOURCE ALLOCATION (25)
ROUTING (24)
COMPUTERS (23)
MULTICORE ARCHITECTURE (23)
MEMORY ARCHITECTURE (22)
MULTICORE PROCESSOR (22)
OPERATING SYSTEMS (21)
PROGRAM COMPILERS (21)
RUNTIME (21)
YARN (21)
ENERGY CONSUMPTION (20)
MULTI-CORE (20)
MULTICORE ARCHITECTURES (20)
COMPLEXITY THEORY (19)
CRYPTOGRAPHY (19)
DIGITAL SIGNAL PROCESSING (19)
GRAPHICS PROCESSING UNIT (19)
TILES (18)
MULTICORE (17)
MULTICORE PROCESSORS (17)
MULTIPROCESSOR INTERCONNECTION NETWORKS (17)
SPACE EXPLORATION (17)
DESIGN SPACE EXPLORATION (16)
EMBEDDED SYSTEM (16)
HARDWARE-SOFTWARE CODESIGN (16)
MEMORY MANAGEMENT (16)
MULTIPROCESSOR SYSTEM-ON-CHIP (16)
POWER AWARE COMPUTING (16)
POWER DEMAND (16)
TIMING (16)
ACCELERATION (15)
ANALYTICAL MODELS (15)
APPLICATION PROGRAM INTERFACES (15)
CONCURRENT COMPUTING (15)
ENGINES (15)
OPERATING SYSTEMS (COMPUTERS) (15)
POWER CONSUMPTION (15)
SCHEDULES (15)
SERVERS (15)
APPLICATION SOFTWARE (14)
LINUX (14)
PARALLEL ALGORITHMS (14)
PARALLEL MACHINES (14)
PROCESS CONTROL (14)
VLIW (14)
DESIGN METHODOLOGY (13)
ENERGY EFFICIENCY (13)
FAULT TOLERANCE (13)
HIGH PERFORMANCE COMPUTING (13)
