When designing a Multi-Processor System-on-Chip (MPSoC), a very large range of design alternatives arises from a huge space of possible design options and component choices. The literature proposes numerous Design-Space Exploration (DSE) approaches that mainly focus on cost optimization. In this paper, we present a DSE approach which focuses on the reliability of the whole design. This approach is based...
Because of hardware faults, processors in large-scale software-intensive systems frequently fail to operate properly. Most traditional fault-tolerant methods do not distinguish the type of hardware failure. In view of this, we propose a self-repairing software architecture for predictable hardware faults. By introducing computational reflection, the software architecture...
These two issues are addressed in this paper: 1) the formal definitions of the concepts relevant to program faults, and 2) the comparison and classification of program fault-tolerant abilities. We first analyze the subtle differences among these basic concepts: faults, errors and failures, and give their formal definitions using the state-based theory of program behavior; we then propose...
This paper discusses SEE effects in an architecture based on commercial-off-the-shelf multicore processors for consolidating mixed-criticality applications in single board computers for space applications. This paper builds on previously proposed system-level architectures for mixed-criticality applications, which are summarized here for convenience together with their previous validation results. The...
The current developments of the software defined networking (SDN) paradigm provide a flexible architecture for network control and management, at the cost of deploying new hardware to replace the existing routing infrastructure. Further, the centralized controller architecture of SDN makes the network prone to a single point of failure and creates a performance bottleneck. To avoid these issues and to support...
Due to the advent of active safety features and automated driving capabilities, the complexity of embedded computing systems within automobiles continues to increase. Such advanced driver assistance systems (ADAS) are inherently safety-critical and must tolerate failures in any subsystem. However, fault tolerance in safety-critical systems has traditionally been supported by hardware replication, which...
The problem of software fault tolerance is described. The fault-tolerance problem is considered in terms of hardware faults and software errors, and a classification of software errors is proposed. The authors describe the computational process as a tree-like directed graph. Errors are introduced into the realisation of the algorithm at the programming stage, causing a “real” algorithm to form instead of its “theoretical”...
This paper presents a theoretical comparison of different existing data error detection techniques. The techniques are compared by fault coverage, memory overhead and performance overhead. For this comparison, ten different data error detection techniques are taken into account. In general, the best error detection technique always has the highest fault coverage with low performance and memory overhead...
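The abstract above weighs data error detection techniques by fault coverage against performance and memory overhead. As a minimal illustrative sketch (not taken from the paper) of one classic point in that trade-off space, duplication with comparison recomputes each result on duplicated data and flags any mismatch; the function name and error handling here are assumptions for illustration:

```python
def duplicated_add(a, b):
    """Duplication-with-comparison sketch: compute the result twice
    and compare at a checking point.

    This buys high fault coverage at roughly 2x performance and
    memory overhead: a transient fault that corrupts only one copy
    is detected when the two results disagree.
    """
    r1 = a + b  # primary computation
    r2 = a + b  # shadow computation on the duplicated operands
    if r1 != r2:  # comparison point
        raise RuntimeError("data error detected: duplicated results diverge")
    return r1
```

In hardened compilers this duplication is inserted automatically at the instruction level; the sketch only shows where the overhead and the coverage come from.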
This work focuses on the development of models, methods, and tools to increase the fault tolerance of high-performance computing systems. The described models and methods are based on automatic diagnostics of the basic software and hardware components of these systems, automatic fault localization and correction, and automatic HPC-system reconfiguration mechanisms. The originality...
The newest integrated circuit fabrication technologies allow billions of transistors to be arranged on a single chip, enabling the implementation of complex parallel systems, which require a highly scalable parallel communication architecture, such as a Network-on-Chip (NoC). These technologies are very close to physical limits, increasing faults both in manufacturing and at runtime. Thus, it is essential to provide...
Systems are expected to evolve during their service life in order to cope with changes of various natures, ranging from fluctuations in available resources to additional features requested by users. For dependable embedded systems, the challenge is even greater, as evolution must not impair dependability attributes. Resilient computing implies maintaining dependability properties when facing changes...
Mobile applications are a part of human life, ranging from simple tasks such as e-mail to critical operations such as security surveillance. Owing to the variety of software and hardware used in mobile devices, failures of mobile applications are unavoidable. Such failures pose a serious threat to the success of mobile software. Also, those failures can result in a great...
The advent of software-based fault tolerance presents a rare opportunity to create a new paradigm for support equipment architecture. This test system must be capable of servicing the development, integration, and test of hardware and software, allowing developers remote access to the units under test (UUT) throughout the integration and test process. Using mainly low-cost commercial off-the-shelf...
As the fault frequency is increasing with the component count in modern and future computer systems, resilience becomes increasingly critical. Existing work on anomaly detection and fault prediction enables failure avoidance techniques to circumvent fault effects proactively. In addition, traditional fault tolerance techniques can be applied to handle faults reactively. Different types of faults may...
As hardware components are expected to become ever more unreliable due to technology scaling, hardware errors have become unavoidable. Dependable systems that rely on correct functionality often use redundancy to detect such hardware faults during operation. However, to design cost-efficient reliable systems, it is crucial to effectively exploit the available redundancy. Thus, researchers have...
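As a minimal sketch (not from the paper above) of how redundancy is exploited to detect and mask runtime faults, a triple-modular-redundancy (TMR) voter compares three redundant results; the function name and the tie-breaking policy are assumptions for illustration:

```python
def tmr_vote(results):
    """Majority vote over three redundant results (TMR sketch).

    With three copies, a single corrupted result is both detected and
    masked by the majority; if all three replicas disagree, the fault
    is detected but cannot be corrected by voting alone.
    """
    a, b, c = results
    if a == b or a == c:
        return a  # a agrees with at least one replica: majority value
    if b == c:
        return b  # a was the corrupted minority replica
    raise RuntimeError("uncorrectable: all three replicas disagree")
```

Duplication (two copies) would detect the same single fault at lower cost but could not mask it, which is exactly the coverage-versus-overhead trade-off such abstracts refer to.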
For the High Performance Linpack (HPL) benchmark at the coming Exascale and beyond, silent errors such as bit flips in memory are expected to become inevitable. However, since bit-flip errors are difficult to detect and locate, their impact on the numerical correctness of HPL has not been evaluated thoroughly and quantitatively, even though systems at Exascale are especially susceptible. In...
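One standard way to catch the silent bit flips the abstract above describes is a checksum kept alongside the data, in the style of algorithm-based fault tolerance (ABFT). A minimal sketch, with hypothetical function names and an integer checksum chosen purely for illustration:

```python
def checksum(vec):
    """Checksum over a data block, computed while the data is known good."""
    return sum(vec)

def has_silent_error(vec, stored_checksum):
    """ABFT-style check: recompute the checksum and compare it with the
    value stored before the data was exposed to faults. A bit flip in
    any single element changes the sum and is therefore flagged."""
    return checksum(vec) != stored_checksum

# Usage: simulate a radiation-induced single-bit flip and detect it.
data = [10, 20, 30, 40]
ck = checksum(data)   # checksum taken while the block is known good
data[2] ^= 1 << 3     # flip bit 3 of one element: 30 becomes 22
corrupted = has_silent_error(data, ck)
```

Full ABFT schemes for linear algebra extend this idea to row and column checksums that survive the factorization itself, so errors can be located and corrected, not merely detected.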
Due to voltage and structure shrinking, the influence of radiation on a circuit's operation increases, resulting in future hardware designs exhibiting much higher rates of soft errors. Software developers have to cope with these effects to ensure functional safety. However, software-based hardware fault tolerance is a holistic property that is tricky to achieve in practice, potentially impaired by...
Virtualization technology has been widely used in today's cloud computing datacenters. With virtualization, each physical machine in a datacenter can be logically divided into several virtual machines, on which different types of software services can be hosted. However, many factors may decrease the availability of the whole system. For example, a failed physical machine automatically...
To achieve better performance, computer designers employ advanced techniques that shrink feature sizes, lower supply voltage, and increase clock rates and memory capacity, but meanwhile modern computers become increasingly vulnerable to soft errors caused by energetic particles, such as alpha particles and neutron strikes. Therefore, fault tolerance evolves into one of the most significant design objectives,...
In safety-critical environments it is no longer sufficient to rely on legacy methodologies. Correctness should be built in all the way through the process. This paper presents a toolchain which allows theorem prover output to be interfaced to fault-tolerant FPGA circuitry. We show a shallow embedding of a lambda calculus executing on a Xilinx platform with the assistance of a choice of fault-tolerance...