The permanent magnet synchronous motor (PMSM) supplied by an inverter plays a key role in critical applications. Therefore, much effort has been devoted to the fault tolerance of the PMSM drive system to ensure that the system continues to operate in the post-fault situation. Fault tolerance includes fault detection, fault diagnosis, and remedial action combining hardware and software reconfigurations...
With the development of neural-network-based machine learning and its use in mission-critical applications, voices are rising against the black-box aspect of neural networks as it becomes crucial to understand their limits and capabilities. With the rise of neuromorphic hardware, it is even more critical to understand how a neural network, as a distributed system, tolerates the failures of its...
Aggregating millions of hardware components to construct an exascale computing platform will pose significant resilience challenges. In addition to slowdowns associated with detected errors, silent errors are likely to further degrade application performance. Moreover, silent data corruption (SDC) has the potential to undermine the integrity of the results produced by important scientific applications...
The continuous growth of high-performance computing (HPC) systems has led to Fault Tolerance (FT) being identified as one of the major challenges for exascale computing, due to the expected decrease in Mean Time Between Failures (MTBF). One source of faults is soft errors, which can cause bit corruptions to the data held in memory. Current solutions for protection against these errors include hardware...
Consider the Filling problem, in which a set of mobile robots enter an unknown area and have to disperse in that area. The robots are homogeneous, anonymous, autonomous, have limited visibility radius, and do not use explicit communication. Moreover, these robots are oblivious, i.e. they do not have any bits of persistent memory. It is already known that these limitations prevent the creation of a...
This paper presents an integrated design environment (IDE) for embedded fault-tolerant processor systems. It takes in a processor core IP and the embedded software that is to be executed on the given processor, and turns them into a fault-tolerant system with various hardware and software mechanisms, subject to the designer's selection. The hardware options include dual redundancy for processor core,...
In this report, we suggest an approach to a fault-tolerant multichannel digital averaging converter based on the use of original structural organizations of hardware systems. Such systems are oriented toward processing measurement results presented in pulse-stream form and perform primary functional conversions of the pulse data flow on the basis of the integration of informational processes of...
When designing a Multi-Processor System-on-Chip (MPSoC), a very large range of design alternatives arises from a huge space of possible design options and component choices. The literature proposes numerous Design-Space-Exploration (DSE) approaches that mainly focus on cost optimization. In this paper, we present a DSE approach which focuses on the reliability of the whole design. This approach is based...
The integration of the Internet of Things (IoT) and Cloud computing has given rise to IoT Clouds able to provide different kinds of IoT as a Service solutions consisting of various micro-services deployed in IoT devices (including sensors and actuators) interacting with different Infrastructure, Platform, and Software as a Service (i.e., IaaS, PaaS, SaaS) offerings running in the Clouds' data centres. On...
Iterative solvers like the Preconditioned Conjugate Gradient (PCG) method are widely used in compute-intensive domains including science and engineering that often impose tight accuracy demands on computational results. At the same time, the error resilience of such solvers may change in the course of the iterations, which requires careful adaptation of the induced approximation errors to reduce the...
This paper presents and evaluates a hybrid fault tolerance approach for dynamically scheduled processors that combines on-line error-correction for run-time fault handling with reconfiguration techniques for permanent fault handling. A permanent reconfiguration is triggered on-demand during runtime, depending on the frequency of on-line corrected faults. The presented work evaluates the effect of...
The brushless DC motor is widely used in the space industry owing to its high performance, but the complex application environment exposes the motor to many damaging factors. For example, space radiation may damage circuit devices, and strong electromagnetic fields may interfere with motor operation. Therefore, the high reliability of the motor system becomes increasingly important. In order to...
The proliferation of cyber-physical systems in society, from the smart grid to sensor networks and robots, has raised the importance of error resilience in signal processing and control systems to unprecedented levels. Resilience to errors in sensing and control algorithm execution in processors all the way down to circuits for sensing and actuation is of critical importance in safety-critical applications...
Due to its high bandwidth, good maintainability and flexibility, Gigabit Ethernet is well suited to high-performance server applications. As an interface between network and host, the Ethernet controller has been continually evolving to meet the ever-increasing communication demands placed on it by enterprise applications. For the Ethernet, network links and physical devices, such as...
Because of hardware faults, situations in which the processor cannot perform properly occur frequently in large-scale software-intensive systems. Most traditional fault-tolerant methods do not distinguish the type of hardware failure. In view of this, we propose a self-repairing software architecture for predictable hardware faults. By introducing computational reflection, the software architecture...
Two issues are addressed in this paper: 1) the formal definitions of the concepts relevant to program faults, and 2) the comparison and classification of program fault-tolerant abilities. We first analyze the subtle differences among these basic concepts (faults, errors and failures) and represent their formal definitions by using the state-based theory of program behavior; and then we propose...
Due to the downscaling of transistor feature sizes, today's integrated circuits are more vulnerable to various effects that can cause faults during operation. Appropriate mechanisms for handling these faults in the field are nonetheless required to meet certain dependability demands. At the same time, the overhead in chip area and power consumption that is caused by such fault tolerance techniques...
In contrast to applications relying on specialized and expensive highly-available infrastructure, the basic approach of microservice architectures to achieve fault tolerance – and finally high availability – is to modularize the software system into small, self-contained services that are connected via implementation-independent interfaces. Microservices and all dependencies are deployed into self-contained...
Many techniques have been proposed in the literature to cope with transient, permanent and malicious faults in computing systems. Among these techniques for reliability improvement and fault tolerance, Control Flow Checking allows covering any fault affecting the part of the storing elements containing the executable program, as well as all the hardware components handling the program itself and its flow...
Critical applications require reliable processors that combine performance with low cost and energy consumption. Very Long Instruction Word (VLIW) processors have inherent resource redundancy that is not constantly used due to the application's fluctuating Instruction Level Parallelism (ILP). Reliability through idle slots utilization is explored either at compile-time, increasing code size and storage requirements,...