Search results

chapter

Disruption-free software updates in automation systems

Michael Wahler, Manuel Oriol

Proceedings of the 2014 IEEE Emerging Technology and Factory Automation (ETFA) > 1 - 8

2014 IEEE Emerging Technology and Factory Automation (ETFA)

Automation systems must primarily be deterministic and reliable, especially in safety-critical environments. With recent trends such as mass customization or Industry 4.0, there is an increasing need for automation systems to be dynamic. Changing parts of the software of today's automation systems, however, typically requires rebooting the controller, which makes software updates a complex and costly...

chapter

How Processor Speedups Can Slow Down I/O Performance

Hung-Ching Chang, Bo Li, Matthew Grove, Kirk W. Cameron

2014 IEEE 22nd International Symposium on Modelling, Analysis & Simulation of Computer and Telecommunication Systems > 395 - 404

2014 IEEE 22nd International Symposium on Modelling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS)

Power states in power-scalable systems are managed to maximize performance and reduce energy waste. Power-scalable processor capabilities (e.g., Intel Turbo Boost) embrace a "faster is better" approach to power management. While these technologies can vastly improve performance and energy efficiency, there is a growing body of evidence that "faster is not always better". For example,...

chapter

CASITA: A Tool for Identifying Critical Optimization Targets in Distributed Heterogeneous Applications

Felix Schmitt, Jonas Stolle, Robert Dietrich

2014 43rd International Conference on Parallel Processing Workshops > 186 - 195

2014 43rd International Conference on Parallel Processing Workshops (ICPPW)

Programming of high performance computing systems has become more complex over time. Several layers of parallelism need to be exploited to efficiently utilize the available resources. To support application developers and performance analysts we propose a technique for identifying the most performance critical optimization targets in distributed heterogeneous applications. We have developed CASITA,...

chapter

Adaptive Algorithm and Tool Flow for Accelerating SystemC on Many-Core Architectures

Christoph Roth, Simon Reder, Harald Bucher, Oliver Sander, more

2014 17th Euromicro Conference on Digital System Design > 137 - 145

2014 17th Euromicro Conference on Digital System Design (DSD)

Within this paper an adaptive approach for parallel simulation of SystemC RTL models on future many-core architectures like the Single-chip Cloud Computer (SCC) from Intel is presented. It is based on a configurable parallel SystemC kernel that preserves the partial order defined by the SystemC delta cycles while avoiding global synchronization as far as possible. The underlying algorithm relies on...

chapter

DMCTCP: Desynchronized Multi-Channel TCP for high speed access networks with tiny buffers

Cheng Cui, Lin Xue, Chui-Hui Chiu, Praveenkumar Kondikoppa, more

2014 23rd International Conference on Computer Communication and Networks (ICCCN) > 1 - 8

2014 23rd International Conference on Computer Communication and Networks (ICCCN)

The past few years have witnessed debate on how to improve link utilization of high-speed tiny-size buffer routers. Widely argued proposals for TCP traffic to realize acceptable link capacities mandate: (i) over-provisioned core link bandwidth; (ii) non-bursty flows; and (iii) tens of thousands of asynchronous flows. However, in high speed access networks where flows are bursty, sparse and synchronous,...

chapter

Times square - marriage of real-time and logical-time in GALS and synchronous languages

HeeJong Park, Avinash Malik, Zoran Salcic

2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications > 1 - 10

2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA)

In this paper we introduce exact and non-exact real-time waits in reactive Globally Asynchronous Locally Synchronous (GALS) programming languages and synchronous languages as their subset. The language constructs that allow use of real-time waits are illustrated on the SystemJ GALS language. They allow system designers to explicitly use, at the specification level, not only logical time but also the...

chapter

A Flexible and Scalable Affinity Lock for the Kernel

Benlong Zhang, Junbin Kang, Tianyu Wo, Yuda Wang, more

2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS) > 34 - 37

2014 IEEE International Conference on High Performance Computing and Communications (HPCC), 2014 IEEE 6th International Symposium on Cyberspace Safety and Security (CSS) and 2014 IEEE 11th International Conference on Embedded Software and Systems (ICESS)

A number of NUMA-aware synchronization algorithms have been proposed lately to stress the scalability inefficiencies of existing locks. However, their presupposed local lock granularity, a physical processor, is often not the optimum configuration for various workloads. This paper further explores the design space by taking into consideration the physical affinity between the cores within a single...

chapter

Data Interception through Broken Concurrency in Kernel Land

Julian L. Rrushi

2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS) > 785 - 793

2014 IEEE International Conference on High Performance Computing and Communications (HPCC), 2014 IEEE 6th International Symposium on Cyberspace Safety and Security (CSS) and 2014 IEEE 11th International Conference on Embedded Software and Systems (ICESS)

We present a kernel data interception technique that is undetectable by existing approaches to malware detection, and propose practical methods to detect it. The technique is based on breaking concurrency in a way that enables the attack code to take over the synchronization established by target kernel modules. That level of control allows the attack code to interpose between those modules, and thus...

chapter

MultiSloth: An Efficient Multi-core RTOS Using Hardware-Based Scheduling

Rainer Müller, Daniel Danner, Wolfgang Schröder-Preikschat, Daniel Lohmann

2014 26th Euromicro Conference on Real-Time Systems > 189 - 198

2014 26th Euromicro Conference on Real-Time Systems (ECRTS)

Multi-core operating systems inherently face the problem of concurrent access to internal kernel state held in shared memory. Previous work on the Sloth real-time kernel proposed to offload the scheduling decisions to the interrupt hardware, thus removing the need for a software scheduler: no state has to be managed in software. While our existing design covers single-core platforms only, we now present...

chapter

Sparse matrix computations on clusters with GPGPUs

Valeria Cardellini, Alessandro Fanfarillo, Salvatore Filippone

2014 International Conference on High Performance Computing & Simulation (HPCS) > 23 - 30

2014 International Conference on High Performance Computing & Simulation (HPCS)

Hybrid nodes containing GPUs are rapidly becoming the norm in parallel machines. We have conducted some experiments regarding how to plug GPU-enabled computational kernels into PSBLAS, a MPI-based library specifically geared towards sparse matrix computations. In this paper, we present our findings on which strategies are more promising in the quest for the optimal compromise among raw performance,...

chapter

“Swimming pool”-like distributed architecture for clock generation in large many-core SoC

Chuan Shan, Francois Anceau, Dimitri Galayko, Eldar Zianbetov

2014 IEEE International Symposium on Circuits and Systems (ISCAS) > 2768 - 2771

2014 IEEE International Symposium on Circuits and Systems (ISCAS)

Synchronization is an issue of significant importance in large-scale, distributed and high-speed systems. The traditional globally synchronous approach is no longer viable due to severe wire delay. Solutions such as “Globally Asynchronous, Locally Synchronous (GALS)” approaches suffer from metastability risk, limiting their use in many-core SoC for critical applications, such as aerospace, military or...

chapter

SimParallel: A high performance parallel SystemC simulator using hierarchical multi-threading

Moo-Kyoung Chung, Jun-Kyoung Kim, Soojung Ryu

2014 IEEE International Symposium on Circuits and Systems (ISCAS) > 1472 - 1475

2014 IEEE International Symposium on Circuits and Systems (ISCAS)

As the system complexity increases, the simulation performance becomes one of the most important issues in virtual prototyping. Parallel simulation is a fascinating technique for high-speed simulation utilizing state of the art multi-core processors on a host workstation, but the efficiency of the parallel simulation is low because of the synchronization and communication overhead and unbalanced workloads...

chapter

Tyche: An efficient Ethernet-based protocol for converged networked storage

Pilar Gonzalez-Ferez, Angelos Bilas

2014 30th Symposium on Mass Storage Systems and Technologies (MSST) > 1 - 11

2014 30th Symposium on Mass Storage Systems and Technologies (MSST)

Current technology trends for efficient use of infrastructures dictate that storage converges with computation by placing storage devices, such as NVM-based cards and drives, in the servers themselves. With converged storage the role of the interconnect among servers becomes more important for achieving high I/O throughput. Given that Ethernet is emerging as the dominant technology for datacenters,...

chapter

CLUE: System trace analytics for cloud service performance diagnosis

Hui Zhang, Junghwan Rhee, Nipun Arora, Sahan Gamage, more

2014 IEEE Network Operations and Management Symposium (NOMS) > 1 - 9

NOMS 2014 - 2014 IEEE/IFIP Network Operations and Management Symposium

In this paper, we present CLUE, a system event analytics tool for black-box performance diagnosis in production Cloud Computing systems. CLUE provides a unified and extensible means of profiling service transactional behaviors, and builds structured data called event sketches. CLUE further offers a set of analytic tools for summarizing and analyzing event sketches by integrating data mining and statistical...

chapter

Scalable Critical Path Analysis for Hybrid MPI-CUDA Applications

Felix Schmitt, Robert Dietrich, Guido Juckeland

2014 IEEE International Parallel & Distributed Processing Symposium Workshops > 908 - 915

2014 IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW)

Utilizing accelerators in heterogeneous systems is an established approach for designing peta-scale applications. Today, CUDA offers a rich programming interface for GPU accelerators but requires developers to incorporate several layers of parallelism on both CPU and GPU. From this increasing program complexity emerges the need for sophisticated performance tools. This work contributes by analyzing...

chapter

Parallelism Extraction Algorithm from Stream-Based Processing Flow Applying Spanning Tree

Guyue Wang, Shinichi Yamagiwa, Koichi Wada

2014 IEEE International Parallel & Distributed Processing Symposium Workshops > 632 - 641

2014 IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW)

Manycore architectures promote massively parallel computing on accelerators. The GPU in particular is one of the mainstays of high performance computing and is employed by top supercomputers in the world. The programming method on such accelerators includes development of a control program. The accelerator executes it to schedule the invocation timing of the accelerator's kernel program...

chapter

Hierarchical Pipeline Optimization of Coarse Grained Reconfigurable Processor for Multimedia Applications

Chen Mei, Peng Cao, Yang Zhang, Bo Liu, more

2014 IEEE International Parallel & Distributed Processing Symposium Workshops > 281 - 286

2014 IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW)

Driven by consumer demand, the multimedia market is booming and video coding standards are evolving rapidly. A dynamically reconfigurable coarse-grained architecture, REMUS-II (REconfigurable MUltimedia System 2), is developed as a multi-standard, high-resolution, power-efficient, real-time multimedia decoding processor. The hierarchical pipeline is adopted in the REMUS-II for multimedia...

chapter

Supporting triple-play communications with TDuCSMA and first experiments

Andrea Vesco, Riccardo M. Scopigno, Enrico Masala

2014 IEEE Wireless Communications and Networking Conference (WCNC) > 3260 - 3265

2014 IEEE Wireless Communications and Networking Conference (WCNC)

This work addresses the implications of using the Time-Division Unbalanced Carrier Sense Multiple Access (TDuCSMA) coordination function to support triple-play services. Firstly, the theoretical background of TDuCSMA is reported, presenting its advantages and discussing its full compliance with the IEEE 802.11 standard. Secondly, a prototype of TDuCSMA is discussed in detail. Then, a set of experiments...

chapter

Kernel data race detection using debug register in Linux

Yunyun Jiang, Yi Yang, Tian Xiao, Tianwei Sheng, more

2014 IEEE COOL Chips XVII > 1 - 3

2014 IEEE COOL Chips XVII (COOL Chips)

Data races in parallel programs are notoriously difficult to detect and resolve. Existing research has mostly focused on data race detection at the user level and significant progress has been made in this regard. It is difficult to apply detection methods designed for user-level applications to identify OS kernel level races. In this paper, we present a new detection tool that is able to effectively...

chapter

VMCSnap: Taking Snapshots of Virtual Machine Cluster with Memory Deduplication

Yumei Huang, Renyu Yang, Lei Cui, Tianyu Wo, more

2014 IEEE 8th International Symposium on Service Oriented System Engineering > 314 - 319

2014 IEEE 8th International Symposium on Service Oriented System Engineering (SOSE)

Virtualization is one of the main technologies currently used to deploy computing systems due to the high reliability and rapid crash recovery it offers in comparison to physical nodes. These features are mainly achieved by continuously producing snapshots of the status of running virtual machines. In earlier works, the snapshot of each individual VM is performed independently, ignoring the memory...

INFONA - science communication portal
