Search results

Items from 1 to 12 out of 12 results

chapter

Parallel Responsive Task on Dependable Responsive Multithreaded Processor II

Hiroyuki Chishiro, Yusuke Hatori, Kohei Osawa, Keigo Mizotani, et al.

2016 IEEE 4th International Conference on Cyber-Physical Systems, Networks, and Applications (CPSNA) > 89 - 94

Cyber-Physical Systems (CPS) are tight integrations of computational and physical worlds for various kinds of applications. For example, a humanoid robot, which is a typical application of CPS, has required timing constraints, low-latency execution, and parallel processing to achieve fine-grained real-time execution. Therefore low-latency parallel real-time computing is an important factor for CPS...

chapter

Fast and Scalable Thread Migration for Multi-core Architectures

Miguel Rodrigues, Nuno Roma, Pedro Tomas

2015 IEEE 13th International Conference on Embedded and Ubiquitous Computing > 9 - 16

2015 IEEE 13th International Conference on Embedded and Ubiquitous Computing (EUC)

Heterogeneous computing is a promising approach to tackle the thermal, power and energy constraints posed by modern desktop and embedded computing systems. However, by also allowing the migration of application threads to the most appropriate cores, significant performance gains and energy efficiency levels can also be attained. Nevertheless, the considerably large overheads usually imposed by software-based...

chapter

Painless Parallelism on Heterogeneous Hardware Leveraging the Functional Paradigm

Mauro Blanco, Pablo Perdomo, Pablo Ezzatti, Alberto Pardo, et al.

2015 International Symposium on Computer Architecture and High Performance Computing Workshop (SBAC-PADW) > 73 - 78

We use a functional framework designed for parallel programming with linear algebra applications to leverage the computing power of heterogeneous hardware. Our work is performed in the context of the pure functional programming language Haskell. The framework allows the manipulation of arbitrary representations for matrices and the definition of multiple implementations of BLAS operations based on...

chapter

Assessing Safe Task Parallelism in SPEC 2006 INT

Tongxin Bai, Chen Ding, Pengcheng Li

2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing > 402 - 411

2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)

To migrate complex sequential code to multicore, profiling is often used on sequential executions to find opportunities for parallelization. In non-scientific code, the potential parallelism often resides in while-loops rather than for-loops. The do-all model used in the past by many studies cannot detect this type of parallelism. A new, task-based model has been used by a number of recent studies...

chapter

PAMI: A Parallel Active Message Interface for the Blue Gene/Q Supercomputer

Sameer Kumar, Amith R. Mamidala, Daniel A. Faraj, Brian Smith, et al.

2012 IEEE 26th International Parallel and Distributed Processing Symposium > 763 - 773

2012 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

The Blue Gene/Q machine is the next generation in the line of IBM massively parallel supercomputers, designed to scale to 262144 nodes and sixteen million threads. With each BG/Q node having 68 hardware threads, hybrid programming paradigms, which use message passing among nodes and multi-threading within nodes, are ideal and will enable applications to achieve high throughput on BG/Q. With such unprecedented...

chapter

HAQu: Hardware-accelerated queueing for fine-grained threading on a chip multiprocessor

Sanghoon Lee, D Tiwari, Y Solihin, J Tuck

2011 IEEE 17th International Symposium on High Performance Computer Architecture > 99 - 110

Queues are commonly used in multithreaded programs for synchronization and communication. However, because software queues tend to be too expensive to support fine-grained parallelism, hardware queues have been proposed to reduce the overhead of communication between cores. Hardware queues require modifications to the processor core and need a custom interconnect. They also pose difficulties for the operating...

chapter

Reconfigurable parallel computing

Dietmar Tutsch

2010 First International Conference On Parallel, Distributed and Grid Computing (PDGC 2010) > 5

2010 1st International Conference on Parallel, Distributed and Grid Computing (PDGC 2010)

Summary form only given. The dynamic reconfiguration of hardware stands for the change of hardware while the system is operating. Its benefit is the adaption to different computing requirements. For instance, an improved use of communication networks can be achieved: Many networks reveal the characteristic that connections between specific communication partners show a smaller latency than others...

chapter

FPGA-based adaptive computing for correlated multi-stream processing

Ming Liu, Zhonghai Lu, Wolfgang Kuehn, Axel Jantsch

2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010) > 973 - 976

2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)

In conventional static implementations for correlated streaming applications, computing resources may be inefficiently utilized since multiple stream processors may supply their sub-results at asynchronous rates for result correlation or synchronization. To enhance the resource utilization efficiency, we analyze multi-streaming models and implement an adaptive architecture based on FPGA Partial Reconfiguration...

chapter

Leakage power reduction for coarse-grained dynamically reconfigurable processor arrays using Dual Vt cells

K. Hirai, M. Kato, Y. Saito, H. Amano

2009 International Conference on Field-Programmable Technology > 104 - 111

2009 International Conference on Field-Programmable Technology (FPT 2009)

One benefit of coarse-grained dynamically reconfigurable processor arrays (DRPAs) is their low dynamic power consumption, achieved by operating a number of processing elements (PEs) in parallel with a low-frequency clock. However, in future advanced processes, leakage power will account for a considerable part of the total power consumption, and it may degrade the advantage of DRPAs. In order to reduce the...

chapter

Efficient Virtualization of High-Performance Network Interfaces

H. Froning, H. Litz, U. Bruning

2009 Eighth International Conference on Networks > 434 - 439

2009 Eighth International Conference on Networks. ICN 2009

The architecture of modern computing systems is getting more and more parallel, in order to exploit more of the offered parallelism by applications and to increase the system's overall performance. This includes multiple cores per processor module, multi-threading techniques and the resurgence of interest in virtual machines. In spite of this amount of parallelism the network interface is typically...

chapter

Distributed Contextual Data Fusion with ACIPL

M.A. McGrath, Y.F. Zheng

2008 IEEE National Aerospace and Electronics Conference > 337 - 342

2008 IEEE National Aerospace and Electronics Conference. NAECON 2008

A system for controlling smart sensor networks is described. The system is called the adaptive context information processing language (ACIPL) which will allow explicit use of states of context inferred from sensor readings and algorithmic output for distributed control of data fusion in sensor networks. The detailed description of the language including its use for sensor information separation into...

chapter

Empirical evaluation of the CRAY-T3D: a compiler perspective

R.H. Arpaci, D.E. Culler, A. Krishnamurthy, S.G. Steinberg, et al.

Proceedings 22nd Annual International Symposium on Computer Architecture > 320 - 331

Most recent MPP systems employ a fast microprocessor surrounded by a shell of communication and synchronization logic. The CRAY-T3D provides an elaborate shell to support global-memory access, prefetch, atomic operations, barriers, and block transfers. We provide a detailed empirical performance characterization of these primitives using micro-benchmarks and evaluate their utility in compiling for...

Filter options

Keywords:
CONTEXT
HARDWARE
PARALLEL PROCESSING
