chapter

A review on fault-tolerant control of PMSM

Wangguang, Zhonghua Wang, Dongxue Wang, Yueyang Li, more

2017 Chinese Automation Congress (CAC) > 3854 - 3859

The permanent magnet synchronous motor (PMSM) supplied by an inverter plays a key role in critical applications. Therefore, much effort has been devoted to fault tolerance of the PMSM drive system to ensure that the system continues to operate in the post-fault situation. Fault tolerance includes fault detection, fault diagnosis, and remedial action combining hardware and software reconfigurations...

chapter

On the Robustness of a Neural Network

El Mahdi El Mhamdi, Rachid Guerraoui, Sebastien Rouault

2017 IEEE 36th Symposium on Reliable Distributed Systems (SRDS) > 84 - 93

With the development of neural-network-based machine learning and their usage in mission-critical applications, voices are rising against the black box aspect of neural networks as it becomes crucial to understand their limits and capabilities. With the rise of neuromorphic hardware, it is even more critical to understand how a neural network, as a distributed system, tolerates the failures of its...

chapter

Evaluating the Viability of Using Compression to Mitigate Silent Corruption of Read-Mostly Application Data

Scott Levy, Kurt B. Ferreira, Patrick G. Bridges

2017 IEEE International Conference on Cluster Computing (CLUSTER) > 603 - 607

Aggregating millions of hardware components to construct an exascale computing platform will pose significant resilience challenges. In addition to slowdowns associated with detected errors, silent errors are likely to further degrade application performance. Moreover, silent data corruption (SDC) has the potential to undermine the integrity of the results produced by important scientific applications...

chapter

Application-Based Fault Tolerance Techniques for Fully Protecting Sparse Matrix Solvers

Grzegorz Pawelczak, Simon McIntosh-Smith, James Price, Matt Martineau

2017 IEEE International Conference on Cluster Computing (CLUSTER) > 733 - 740

The continuous growth of high-performance computing (HPC) systems has led to Fault Tolerance (FT) being identified as one of the major challenges for exascale computing, due to the expected decrease in Mean Time Between Failures (MTBF). One source of faults is soft errors, which can cause bit corruptions to the data held in memory. Current solutions for protection against these errors include hardware...

chapter

Uniform dispersal of silent oblivious robots

Attila Hideg, Lukovszki Tamas, Bertalan Forstner

2017 IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY) > 175 - 180

Consider the Filling problem, in which a set of mobile robots enter an unknown area and have to disperse in that area. The robots are homogeneous, anonymous, autonomous, have limited visibility radius, and do not use explicit communication. Moreover, these robots are oblivious, i.e. they do not have any bits of persistent memory. It is already known that these limitations prevent the creation of a...

chapter

An integrated design environment of fault tolerant processors with flexible HW/SW solutions for versatile performance/cost/coverage tradeoffs

Yi-Ju Ke, Yi-Chieh Chen, Ing-Jer Huang

2017 International Test Conference in Asia (ITC-Asia) > 162 - 167

This paper presents an integrated design environment (IDE) for embedded fault-tolerant processor systems. It takes in a processor core IP and the embedded software to be executed on the given processor, and turns them into a fault-tolerant system with various hardware and software mechanisms, subject to the designer's selection. The hardware options include dual redundancy for processor core,...

chapter

Fault-tolerant multichannel digital averaging converter

A. I. Gulin, N. M. Safyannikov, O.I. Bureneva

2017 IEEE East-West Design & Test Symposium (EWDTS) > 1 - 4

In this report, we suggest an approach to a fault-tolerant multichannel digital averaging converter based on original structural organizations of hardware systems. Such systems are oriented toward processing measurement results presented as pulse streams and perform primary functional conversions of pulse data flow on the basis of integration of informational processes of...

chapter

Model-driven reliability evaluation for MPSoC design

Tien Thanh Nguyen, Anthony Mouraud, Mathieu Thevenin, Gwenole Corre, more

2017 Conference on Design and Architectures for Signal and Image Processing (DASIP) > 1 - 6

When designing a Multi-Processor System-on-Chip (MPSoC), a very large range of design alternatives arises from a huge space of possible design options and component choices. The literature proposes numerous Design-Space-Exploration (DSE) approaches that mainly focus on cost optimization. In this paper, we present a DSE approach which focuses on the reliability of the whole design. This approach is based...

chapter

A Watchdog Service Making Container-Based Micro-services Reliable in IoT Clouds

Antonio Celesti, Lorenzo Carnevale, Antonino Galletta, Maria Fazio, more

2017 IEEE 5th International Conference on Future Internet of Things and Cloud (FiCloud) > 372 - 378

The integration of Internet of Things (IoT) and Cloud computing has given rise to IoT Clouds able to provide different kinds of IoT as a Service solutions consisting of various micro-services deployed in IoT devices (including sensors and actuators) interacting with different Infrastructure, Platform, and Software as a Service (i.e., IaaS, PaaS, SaaS) running in the Clouds' data centres. On...

chapter

Energy-efficient and error-resilient iterative solvers for approximate computing

Alexander Scholl, Claus Braun, Hans-Joachim Wunderlich

2017 IEEE 23rd International Symposium on On-Line Testing and Robust System Design (IOLTS) > 237 - 239

Iterative solvers like the Preconditioned Conjugate Gradient (PCG) method are widely used in compute-intensive domains including science and engineering that often impose tight accuracy demands on computational results. At the same time, the error resilience of such solvers may change in the course of the iterations, which requires careful adaptation of the induced approximation errors to reduce the...

chapter

Handling of permanent faults in dynamically scheduled processors

Felix Muhlbauer, Lukas Schroder, Mario Scholzel

2017 IEEE 23rd International Symposium on On-Line Testing and Robust System Design (IOLTS) > 203 - 204

This paper presents and evaluates a hybrid fault tolerance approach for dynamically scheduled processors that combines on-line error-correction for run-time fault handling with reconfiguration techniques for permanent fault handling. A permanent reconfiguration is triggered on-demand during runtime, depending on the frequency of on-line corrected faults. The presented work evaluates the effect of...

chapter

Design of self-repairing control circuit for brushless DC motor based on evolvable hardware

Ping Zhu, Rui Yao, Junjie Du

2017 NASA/ESA Conference on Adaptive Hardware and Systems (AHS) > 214 - 220

The brushless DC motor is widely used in the space industry owing to its high performance, but the complex application environment exposes the motor to many damage factors. For example, space radiation may damage circuit devices, and strong electromagnetic fields may interfere with motor operation. Therefore, high reliability of the motor system becomes increasingly important. In order to...

chapter

Design of efficient error resilience in signal processing and control systems: From algorithms to circuits

Jacob Abraham, Suvadeep Banerjee, Abhijit Chatterjee

2017 IEEE 23rd International Symposium on On-Line Testing and Robust System Design (IOLTS) > 192 - 195

The proliferation of cyber physical systems in society, from the smart grid to sensor networks and robots has raised the importance of error resilience in signal processing and control systems to unprecedented levels. Resilience to errors in sensing and control algorithm execution in processors all the way down to circuits for sensing and actuation is of critical importance in safety-critical applications...

chapter

Implementation of gigabit ethernet controller with fault tolerance and prevention mechanism

Longfei Li, Zhanzhuang He, Jianfeng Wang, Yangchun Shi

2017 Prognostics and System Health Management Conference (PHM-Harbin) > 1 - 8

Due to its high bandwidth, good maintainability and flexibility, Gigabit Ethernet is highly suitable for high-performance server applications. As an interface between network and host, the Ethernet controller has been continually evolving to meet the ever-increasing communication demands placed on it by enterprise applications. For the Ethernet, network links and physical devices, such as...

chapter

Self-Repairing Software Architecture for Predictable Hardware Faults

Yinghua Guo, Yali Qi, Hang Zhou

2017 4th International Conference on Information Science and Control Engineering (ICISCE) > 1224 - 1228

Because of hardware faults, processors frequently fail to perform properly in large-scale software-intensive systems. Most traditional fault-tolerant methods do not distinguish the type of hardware failure. In view of this, we propose a self-repairing software architecture for predictable hardware faults. By introducing computational reflection, the software architecture...

chapter

Formal Definition of Program Faults and Hierarchy of Program Fault-Tolerant Abilities

Liu Xiaojian, Jiang Ting, Dong Xiaofeng

2017 4th International Conference on Information Science and Control Engineering (ICISCE) > 339 - 343

This paper addresses two issues: 1) the formal definitions of the concepts relevant to program faults, and 2) the comparison and classification of program fault-tolerant abilities. We first analyze the subtle differences among the basic concepts of faults, errors and failures, and give their formal definitions using the state-based theory of program behavior; we then propose...

chapter

Fast power overhead prediction for hardware redundancy-based fault tolerance

Stefan Scharoba, Heinrich T. Vierhaus

2017 IEEE 23rd International Symposium on On-Line Testing and Robust System Design (IOLTS) > 265 - 270

Due to the downscaling of transistor feature sizes, today's integrated circuits are more vulnerable to various effects that can cause faults during operation. Appropriate mechanisms for handling these faults in the field are nonetheless required to meet dependability demands. At the same time, the overhead in chip area and power consumption that is caused by such fault tolerance techniques...

chapter

Highly-Available Applications on Unreliable Infrastructure: Microservice Architectures in Practice

Daniel Richter, Marcus Konrad, Katharina Utecht, Andreas Polze

2017 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C) > 130 - 137

In contrast to applications relying on specialized and expensive highly-available infrastructure, the basic approach of microservice architectures to achieve fault tolerance – and finally high availability – is to modularize the software system into small, self-contained services that are connected via implementation-independent interfaces. Microservices and all dependencies are deployed into self-contained...

chapter

Hacking the Control Flow error detection mechanism

Giorgio Di Natale, Marie-Lise Flottes, Sophie Dupuis, Bruno Rouzeyre

2017 IEEE 2nd International Verification and Security Workshop (IVSW) > 51 - 56

Many techniques have been proposed in the literature to cope with transient, permanent and malicious faults in computing systems. Among these techniques for reliability improvement and fault tolerance, Control Flow Checking covers any fault affecting the part of the storage elements containing the executable program, as well as all the hardware components handling the program itself and its flow...

chapter

NEDA: NOP Exploitation with Dependency Awareness for Reliable VLIW Processors

Rafail Psiakis, Angeliki Kritikakou, Olivier Sentieys

2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) > 391 - 396

Critical applications require reliable processors that combine performance with low cost and energy consumption. Very Long Instruction Word (VLIW) processors have inherent resource redundancy that is not constantly used due to the application’s fluctuating Instruction Level Parallelism (ILP). Reliability through idle slots utilization is explored either at compile-time, increasing code size and storage requirements,...
