2008 14th IEEE International Conference on Parallel and Distributed Systems

chapter

Tolerating Temporal Correlated Failures from Cyclic Dependency in High Performance Computing Systems

Xin Chen, Xubin He

2008 14th IEEE International Conference on Parallel and Distributed Systems > 509 - 516

Correlated failures have recently gained more attention in the research of failures in large scale systems. Recent studies have pointed out the negative effect of ignoring such failures when designing a fault tolerant scheme for large scale systems. In this paper, we explore the behaviors of temporal correlated failures arising from cyclic dependency among task nodes via an abstract model. Using this...

chapter

hFT-FW: Hybrid Fault-Tolerance for Cluster-Based Stateful Firewalls

P.N. Ayuso, L. Lefevre, R.M. Gasca

2008 14th IEEE International Conference on Parallel and Distributed Systems > 525 - 532

Failures are a permanent menace for the availability of Internet services. During the last decades, numerous fault-tolerant approaches have been proposed for the wide spectrum of Internet services, including stateful firewalls. Most of these solutions adopt reactive approaches to mask failures by replicating state-changes between replicas. However, reactive replication is a resource consuming task...

chapter

Integrity-Preserving Replica Coordination for Byzantine Fault Tolerant Systems

Wenbing Zhao

2008 14th IEEE International Conference on Parallel and Distributed Systems > 447 - 454

The use of good random numbers is essential to the integrity of many mission-critical systems. However, when such systems are replicated for Byzantine fault tolerance, a serious issue arises, i.e., how do we preserve the integrity of the systems while ensuring strong replica consistency? Despite the fact that there exists a large body of work on how to render replicas deterministic under the benign...

chapter

Constructing Double-Erasure HoVer Codes Using Latin Squares

Wang Gang, Liu Xiaoguang, Lin Sheng, Xie Guangjun, et al.

2008 14th IEEE International Conference on Parallel and Distributed Systems > 533 - 540

Storage applications are in urgent need of multi-erasure codes, but there is no consensus on the best coding technique. Hafner has presented a class of multi-erasure codes named HoVer codes [1]. This kind of code has a unique data/parity layout that provides a range of implementation options covering a large portion of the performance/efficiency trade-off space. Thus it can be applied to many...

chapter

Transparent and Autonomic Rollback-Recovery in Cluster Systems

A. Maloney, A. Goscinski

2008 14th IEEE International Conference on Parallel and Distributed Systems > 541 - 548

Cluster systems provide an excellent environment to run computation hungry applications. However, due to being created using commodity components they are prone to failures. To overcome these failures we propose to use rollback-recovery, which consists of the checkpointing and recovery facilities. Checkpointing facilities have been the focus of many previous studies; however, the recovery facilities...

chapter

An Efficient Disjoint Shortest Paths Routing Algorithm for the Hypercube

Ke Qiu

2008 14th IEEE International Conference on Parallel and Distributed Systems > 43 - 47

We present a routing algorithm that finds n disjoint shortest paths from the source node to n target nodes in the n-dimensional hypercube in O(n³ log n) = O(log³N log log N) time, where N = 2ⁿ, provided that such disjoint shortest paths exist, which can be checked in O(n^(5/2)) time. This improves on the previous O(n⁴) routing algorithm.

chapter

Fault-Tolerant Algorithms for Detecting Event Regions in Wireless Sensor Networks Using Statistical Hypothesis Test

Donglei Cao, Beihong Jin, Jiannong Cao

2008 14th IEEE International Conference on Parallel and Distributed Systems > 631 - 638

Detecting event regions in a monitored environment is a canonical task of wireless sensor networks (WSNs). It is a hard problem because sensor nodes are prone to failures and have scarce energy. In this paper, we seek distributed and localized algorithms for fault-tolerant event region detection. Most existing algorithms only assume that events are spatially correlated, but we argue that events are...

chapter

Area Difference Based Recovery Information Placement for Mobile Computing Systems

Yi-Wei Ci, Zhan Zhang, De-Cheng Zuo, Zhi-Bo Wu, et al.

2008 14th IEEE International Conference on Parallel and Distributed Systems > 478 - 484

In a mobile computing system, mobile hosts may move around cells, resulting in a considerable cost for locating and retrieving the recovery information, which is necessary for fault tolerance. To speed up the recovery, traditionally, recovery information is migrated according to the location of the mobile host. In this paper, a scheme for efficiently handling the recovery information is proposed....

chapter

A Novel Distributed Index Approach for Service Discovery in MANETs

Yuanfeng Wen, Faen Zhang, Beihong Jin

2008 14th IEEE International Conference on Parallel and Distributed Systems > 415 - 422

Efficiently discovering services in terms of diversified service constraints in a dense MANET is a challenging issue. This paper proposes to build a distributed suffix tree on backbone nodes as XML-based services' index to provide a concise profile for service descriptions. Moreover, a content-addressable P2P overlay and corresponding fault-tolerance mechanisms are introduced to support the distributed...

chapter

GiFT: Automating FTPA Implementation for MPI Programs

Hongyi Fu, Yunfei Du, Panfeng Wang, Jia Jia, et al.

2008 14th IEEE International Conference on Parallel and Distributed Systems > 91 - 98

Fault tolerance is a critical issue in the arena of large-scale computing. The fault-tolerant parallel algorithm (FTPA) is an application-level technique for tolerating hardware failures. FTPA achieves fast failure recovery making use of parallel recomputing. However, it complicates the coding of the application program. This paper uses compiler technology to automate the design of FTPA, and introduces...

chapter

Delaunay State Management for Large-Scale Networked Virtual Environments

Chien-Hao Chien, Shun-Yun Hu, Jehn-Ruey Jiang

2008 14th IEEE International Conference on Parallel and Distributed Systems > 781 - 786

Peer-to-Peer (P2P) networks have been proposed as one promising approach to provide better scalability for Networked Virtual Environment (NVE) systems, but P2P-NVE also increases the probability of cheating by allowing users to manage the states of objects. In this paper, we propose Delaunay State Management (DSM), a P2P-NVE state management scheme that divides the whole virtual world into many triangular...
