Nowadays, many embedded systems with different architectures incorporate GPUs. However, it is difficult to develop CPU-GPU embedded systems using component-based development (CBD), since existing CBD approaches have no support for GPU development. In this context, when targeting a particular CPU-GPU platform, the component developer is forced to construct hardware-specific components,...
State machines are a common technique for describing state-dependent systems such as communication protocols. Although such state machines typically incorporate events to switch between states, a description based on a pure event-based system is quite challenging. In this work, we describe the factors that complicate event-based state machines and present solutions. These solutions are developed especially...
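The kind of event-driven state machine the abstract refers to can be sketched as a transition table keyed by (state, event) pairs. The states and events below model a hypothetical toy connection protocol and are not taken from the paper:

```python
# Minimal event-driven state machine (illustrative; state/event names are
# hypothetical, not the paper's example).
TRANSITIONS = {
    ("IDLE", "connect"): "CONNECTING",
    ("CONNECTING", "ack"): "CONNECTED",
    ("CONNECTED", "close"): "IDLE",
}

class ProtocolMachine:
    def __init__(self):
        self.state = "IDLE"

    def dispatch(self, event):
        # Unknown (state, event) pairs are silently ignored here; deciding
        # what to do with such events is one of the complications a purely
        # event-based description must address.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state
```

Note how the table makes the legal event orderings explicit, which is exactly where out-of-order or unexpected events complicate a pure event-based design.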
Humanoid robots are a typical application of real-time systems and require timing constraints, low latency, and parallel/distributed processing to achieve fine-grained real-time execution. Therefore, we have developed the Dependable Responsive Multithreaded Processor I (D-RMTP I), which has one Responsive Multithreaded Processing Unit with an 8-way prioritized Simultaneous Multithreading architecture...
The worst-case response time (WCRT) – the time span from release to completion of a real-time task – is a crucial property of real-time systems. However, WCRT analysis is complex in practice, as it depends not only on the realistic examination of worst-case execution times (WCET), but also on system-level overheads and blocking/preemption times. While the implicit path enumeration technique (IPET)...
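The relationship between WCET and WCRT that the abstract describes is commonly captured by the classical fixed-point response-time recurrence for fixed-priority uniprocessor scheduling (a standard textbook analysis, not necessarily the paper's IPET-based method): R = C + B + Σ ⌈R/Tj⌉·Cj over higher-priority tasks j, where C is the task's WCET, B its blocking time, and Cj, Tj the WCET and period of each higher-priority task.

```python
import math

def wcrt(C, B, higher_prio):
    """Classical response-time analysis fixed point (illustrative sketch):
    C = task WCET, B = blocking time,
    higher_prio = [(C_j, T_j), ...] for each higher-priority task.
    Assumes the iteration converges (task set is schedulable)."""
    R = C + B
    while True:
        # Each higher-priority task j preempts ceil(R / T_j) times within R.
        nxt = C + B + sum(math.ceil(R / T_j) * C_j for C_j, T_j in higher_prio)
        if nxt == R:
            return R
        R = nxt
```

For example, a task with C = 2 and two higher-priority tasks (C, T) = (1, 4) and (2, 6) converges to a WCRT of 6.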
Single-ISA heterogeneous multi-core processors have been demonstrated to improve the performance and efficiency of general-purpose workloads. However, these designs leave some performance on the table due to the common assumption that the cost of migrating a program from one core to another is high. This high cost is due to the reliance on the operating system for a migration via a context switch...
Cyber-Physical Systems (CPS) are tight integrations of computational and physical worlds for various kinds of applications. For example, a humanoid robot, which is a typical application of CPS, requires timing constraints, low-latency execution, and parallel processing to achieve fine-grained real-time execution. Therefore, low-latency parallel real-time computing is an important factor for CPS...
Memory access tracing is a program analysis technique with many different applications, ranging from architectural simulation to (on-line) data placement optimization and security enforcement. In this article we propose a memory access tracing approach based on static x86 binary instrumentation. Unlike non-selective schemes, which instrument all the memory access instructions, our proposal selectively...
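The selective-versus-non-selective distinction can be illustrated with a toy instrumentation pass over a linear instruction stream: only instructions classified as memory accesses receive a tracing hook. The mnemonics and the list-of-strings "IR" below are invented for illustration and have nothing to do with real x86 encoding or the paper's selection criteria:

```python
# Toy selective instrumentation pass (hypothetical mnemonics).
MEM_OPS = {"mov_load", "mov_store"}

def instrument(program):
    """Insert a trace hook before every memory-access instruction only;
    a non-selective scheme would hook every instruction instead."""
    out = []
    for ins in program:
        if ins in MEM_OPS:
            out.append(("trace", ins))  # tracing hook for this access
        out.append(ins)                 # original instruction preserved
    return out
```

The point of selectivity is that the non-memory instructions (`add`, `sub`, ...) pass through untouched, avoiding their instrumentation overhead.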
Recent high-level synthesis tools offer the capability to generate multi-threaded micro-architectures to hide memory access latencies. In many HLS flows, this is often achieved by just creating multiple processing-element instances (one for each thread). However, more advanced compilers can synthesize hardware in a spatial form of the barrel-processor or simultaneous multi-threading (SMT) approaches,...
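The latency-hiding effect of a barrel-style schedule can be shown with a back-of-the-envelope cycle model (a deliberately simplified model, not the timing of any real HLS output): a thread can issue its next operation only after its memory access completes, so the round-robin period is the larger of the thread count and the access latency.

```python
def barrel_cycles(threads, ops_per_thread, mem_latency):
    """Approximate total cycles for a barrel-style round-robin schedule.
    Each op takes 1 issue cycle plus mem_latency stall cycles; a thread
    re-issues every max(threads, 1 + mem_latency) cycles."""
    period = max(threads, 1 + mem_latency)
    return ops_per_thread * period
```

With a 3-cycle memory latency, a single thread needs 40 cycles for 10 operations, while 4 interleaved threads complete 40 operations (10 each) in the same 40 cycles: the stalls are fully overlapped.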
Heterogeneous computing is a promising approach to tackle the thermal, power and energy constraints posed by modern desktop and embedded computing systems. Moreover, by allowing the migration of application threads to the most appropriate cores, significant performance and energy-efficiency gains can be attained. Nevertheless, the considerable overheads usually imposed by software-based...
Emerging services applications operate on vast datasets that are kept in DRAM to minimize latency and improve throughput. A considerable portion of these applications exhibit irregular memory references, which cause serious locality problems. This paper presents SLIM, a Software-based LIghtweight Multithreading framework, to address this problem on commodity hardware while keeping the simple style of multithreading...
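One common way to realize software-based lightweight multithreading (a generic sketch of the idea, not SLIM's actual design) is to express each lightweight thread as a coroutine that yields at points where a long-latency memory access would stall, letting a tiny round-robin scheduler switch to another thread instead of waiting:

```python
from collections import deque

def worker(name, steps, log):
    """A lightweight 'thread': each yield marks a would-be memory stall
    where the scheduler may switch away."""
    for i in range(steps):
        log.append((name, i))  # do one unit of work
        yield                  # long-latency access: switch out

def run(workers):
    """Round-robin scheduler over generator-based lightweight threads."""
    ready = deque(workers)
    while ready:
        w = ready.popleft()
        try:
            next(w)
            ready.append(w)    # requeue after each step
        except StopIteration:
            pass               # thread finished

log = []
run([worker("a", 2, log), worker("b", 2, log)])
```

The log shows the interleaving a0, b0, a1, b1: while one thread's (simulated) access is outstanding, another makes progress, which is how locality stalls are overlapped.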
We describe extending the hardware/software co-compiler Nymble to automatically generate multi-threaded (SIMT) hardware accelerators. In contrast to prior work that simply duplicated complete compute units for each thread, Nymble-MT reuses the actual computation elements, and adds just the required data storage and context switching logic. On the CHStone benchmark suite and a sample configuration...
Traditionally, operating systems (OSes) suffer from a bifid priority space dictated by the co-existence of threads managed by the kernel scheduler and asynchronous interrupt handlers scheduled by hardware. In real-time systems, where reliability and determinism play a critical role, this approach presents a notable shortcoming, as any interrupt handler can interrupt an execution thread, regardless of its...
GPUs are being increasingly adopted as compute accelerators in many domains, spanning environments from mobile systems to cloud computing. These systems usually run multiple applications, from one or several users. However, GPUs do not provide the support for resource sharing traditionally expected in these scenarios. Thus, such systems are unable to provide key multiprogrammed workload requirements,...
On a resource-sharing platform, running software subcomponents in isolation is critical to protecting users' privacy and data security. In client-server applications, thread isolation is required to prevent private data that belongs only to certain threads from being read or modified by other, unauthorized threads running in the same address space. However, current programming languages (C/C++) and...
We present a new system, KCoFI, that is the first we know of to provide complete Control-Flow Integrity protection for commodity operating systems without using heavyweight complete memory safety. Unlike previous systems, KCoFI protects commodity operating systems from classical control-flow hijack attacks, return-to-user attacks, and code segment modification attacks. We formally verify a subset...
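The core idea behind control-flow integrity can be illustrated with a toy guard on indirect calls: a target is legal only if it belongs to a set of entry points derived ahead of time from the program's control-flow graph. This is a generic CFI illustration, not KCoFI's kernel-level mechanism, and all names below are hypothetical:

```python
# Toy control-flow integrity check (illustrative only).
def legitimate_handler():
    return "ok"

def attacker_gadget():
    return "hijacked"

# In a real CFI system this set is computed from the control-flow graph;
# here it is hand-written for the example.
ALLOWED_TARGETS = {legitimate_handler}

def guarded_indirect_call(target):
    """Refuse any indirect branch whose target is not a known entry point,
    which is what defeats classical control-flow hijack attacks."""
    if target not in ALLOWED_TARGETS:
        raise RuntimeError("CFI violation: unexpected branch target")
    return target()
```

A hijacked function pointer aimed at `attacker_gadget` then faults at the check instead of transferring control.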
As discovered in our previous benchmarking work, a small number of workloads in the PARSEC benchmark suite suffer heavy performance loss in a virtualized execution environment, and the major loss exhibits a fairly strong connection with thread synchronization operations. This paper examines one workload of this kind that makes heavy use of thread synchronization operations, and shows the performance...
The evolution of commodity hardware makes it a very attractive platform to develop high-performance networking applications that are affordable to deploy. All but the most trivial applications must copy packets into user-space for further analysis. Therefore, the allocation of memory for these copies becomes a performance-critical operation. In this work, we present a multi-layer slice memory allocator...
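The general shape of such an allocator (a slab-style sketch of the idea, not the paper's multi-layer design) is to pre-allocate one large block, hand out fixed-size slices from it, and recycle freed slices through a free list, so the per-packet fast path never touches the general-purpose allocator:

```python
# Toy fixed-size slice allocator (illustrative, hypothetical API).
class SliceAllocator:
    def __init__(self, slice_size, slices_per_block):
        self.slice_size = slice_size
        # One up-front allocation backs every slice.
        self.block = bytearray(slice_size * slices_per_block)
        self.free = list(range(slices_per_block))  # free slice indices

    def alloc(self):
        """Return a (memoryview, index) pair for one slice; O(1), no copy."""
        if not self.free:
            raise MemoryError("block exhausted")
        i = self.free.pop()
        view = memoryview(self.block)[i * self.slice_size:(i + 1) * self.slice_size]
        return view, i

    def release(self, i):
        """Recycle a slice index back onto the free list."""
        self.free.append(i)
```

Packet copies then land in recycled slices, turning the performance-critical allocation into a free-list pop.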
To harness the potential of CMPs for scalable, energy-efficient performance in general-purpose computers, the Apple-CORE project has co-designed a general machine model and concurrency control interface with dedicated hardware support for concurrency management across multiple cores. Its SVP interface combines dataflow synchronisation with imperative programming, towards the efficient use of parallelism...
Since the introduction of fully programmable vertex shader hardware, GPU computing has made tremendous advances. Exception support and speculative execution are the next steps to expand the scope and improve the usability of GPUs. However, traditional mechanisms to support exceptions and speculative execution are highly intrusive to GPU hardware design. This paper builds on two related insights to...
Single-Instruction Multiple-Thread (SIMT) micro-architectures implemented in Graphics Processing Units (GPUs) run fine-grained threads in lockstep by grouping them into units, referred to as warps, to amortize the cost of instruction fetch, decode and control logic over multiple execution units. As individual threads take divergent execution paths, their processing takes place sequentially, defeating part...
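The serialization cost of divergence can be modeled with a one-line rule (a simplified model, not real GPU timing): if any thread of the warp takes a path, the whole warp spends that path's cycles under an active mask, so a divergent branch costs the sum of both paths.

```python
def warp_cycles(branch_taken, then_len, else_len):
    """Toy SIMT divergence model. branch_taken: per-thread branch outcomes
    for one warp; then_len/else_len: cycle counts of the two paths.
    A path is executed (serially, under a mask) iff some thread needs it."""
    takes_then = any(branch_taken)       # at least one thread goes 'then'
    takes_else = not all(branch_taken)   # at least one thread goes 'else'
    return then_len * takes_then + else_len * takes_else
```

A uniform warp pays for only one path, while a half-and-half warp pays for both, which is the lost lockstep parallelism the abstract describes.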