Search results

Items 1 to 10 of 10 results

chapter

A New Fault Tolerance Heuristic for Scientific Workflows in Highly Distributed Environments Based on Resubmission Impact

K. Plankensteiner, R. Prodan, T. Fahringer

2009 Fifth IEEE International Conference on e-Science > 313 - 320

2009 5th IEEE International Conference on e-Science (e-Science 2009)

Even though highly distributed environments such as Clouds and Grids are increasingly used for e-science high performance applications, they still cannot deliver the robustness and reliability needed for widespread acceptance as ubiquitous scientific tools. To overcome this problem, existing systems resort to fault tolerance mechanisms such as task replication and task resubmission. In this paper...

chapter

Satisfying Service Level Objectives in a Self-Managing Resource Pool

D. Gmach, J. Rolia, L. Cherkasova

2009 Third IEEE International Conference on Self-Adaptive and Self-Organizing Systems > 243 - 253

2009 Third IEEE International Conference on Self-Adaptive and Self-Organizing Systems (SASO)

We consider a self-managing, self-organizing pool of virtualized computer servers that provides infrastructure as a service (IaaS) for enterprise computing workloads. A global controller automatically manages the pool in a top down manner by periodically varying the number of servers used and re-assigning workloads to different servers. It aims to use as few servers as possible to minimize power usage...

chapter

A Self-Stabilizing O(n)-Round k-Clustering Algorithm

A.K. Datta, S. Devismes, L.L. Larmore

2009 28th IEEE International Symposium on Reliable Distributed Systems > 147 - 155

2009 28th IEEE International Symposium on Reliable Distributed Systems (SRDS)

Given an arbitrary network G of processes with unique IDs and no designated leader, and given a k-dominating set I ⊆ G, we propose a silent self-stabilizing distributed algorithm that computes a subset D of I which is a minimal k-dominating set of G. Using D as the set of cluster-heads, a partition of G into clusters, each of radius k, follows. The algorithm is comparison-based, requires O(log n)...

chapter

A fault-tolerant peer-to-peer object storage architecture with multidimensional range search capabilities and adaptive topology

M.I. Andreica, E.-D. Tirsa, N. Tapus

2009 IEEE 5th International Conference on Intelligent Computer Communication and Processing > 221 - 228

2009 IEEE 5th International Conference on Intelligent Computer Communication and Processing (ICCP)

In this paper we present a fault-tolerant, collaborative peer-to-peer object storage architecture with adaptive topology and efficient multidimensional range search capabilities. Every stored object has a fixed set of index properties, whose ranges of values form a multidimensional geometric property space. The architecture efficiently supports multidimensional range queries by mapping the peer identifiers...

chapter

Scheduling and Controlling Semantics for Distributed Resource Based Computing Engines

P. Varma, V.K. Naik

2009 Third IEEE International Conference on Secure Software Integration and Reliability Improvement > 47 - 56

2009 Third IEEE International Conference on Secure Software Integration and Reliability Improvement. SSIRI 2009

With the advent of autonomic and cloud computing, computation engines are getting redefined as dynamic configurations of heterogeneous, distributed resources. In this paper, we describe the operational semantics of scheduling and controlling of computation engines configured from component resources subject to dependency and capacity constraints and in accordance with policies and objectives such...

chapter

Using rough set based multi-checkpointing for fault-tolerance scheduling in economic grids

A. Bouyer, A.H. Abdullah, M.M. Sap

2009 First International Conference on Networked Digital Technologies > 321 - 326

2009 First International Conference on Networked Digital Technologies (NDT 2009)

Fault-tolerant Grid scheduling is of vital importance in the Grid computing world. Task replication and checkpointing are two popular methods to achieve fault-tolerant scheduling. Replication is not applicable in economic-based grid computing because it uses a large number of resources: the consumer must pay for the time spent on all participating nodes. In this paper, we proposed...

chapter

Throughput Optimization for Micro-factories Subject to Failures

A. Benoit, A. Dobrila, J.-M. Nicod, L. Philippe

2009 Eighth International Symposium on Parallel and Distributed Computing > 11 - 18

2009 Eighth International Symposium on Parallel and Distributed Computing (ISPDC)

In this paper, we study the problem of optimizing the throughput for micro-factories subject to failures. The challenge consists in mapping several tasks onto a set of machines. The originality of our approach is the failure model for such applications in which tasks are subject to failures rather than machines. If there is exactly one task per machine in the mapping, then we prove that the optimal...

chapter

Fully Distributed and Fault Tolerant Task Management Based on Diffusions

A. Bui, O. Flauzac, C. Rabat

2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing > 355 - 360

2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing

The task management is a critical component for the computational grids. The aim is to assign tasks on nodes according to a global scheduling policy and a view of local resources of nodes. A peer-to-peer approach for the task management involves a better scalability for the grid and higher fault tolerance. But some mechanisms have to be proposed to avoid the computation of replicated tasks that can...

chapter

Group-based Coordinated Checkpointing for MPI: A Case Study on InfiniBand

Qi Gao, Wei Huang, M.J. Koop, D.K. Panda

2007 International Conference on Parallel Processing (ICPP 2007) > 47

2007 International Conference on Parallel Processing

As more and more clusters with thousands of nodes are being deployed for high performance computing (HPC), fault tolerance in cluster environments has become a critical requirement. Checkpointing and rollback recovery is a common approach to achieve fault tolerance. Although widely adopted in practice, coordinated checkpointing has a known limitation on scalability. Severe contention for bandwidth...

article

Cyclic Storage for Fault-Tolerant Distributed Executions

R. Marcelin-Jimenez, S. Rajsbaum, B. Stevens

IEEE Transactions on Parallel and Distributed Systems > 2006 > 17 > 9 > 1028 - 1036

Given a set V of active components in charge of a distributed execution, a storage scheme is a sequence B_0, B_1, ..., B_{b-1} of subsets of V, where successive global states are recorded. The subsets, also called blocks, have the same size and are scheduled according to some fixed and cyclic calendar of b steps. During the ith step, block B_i is selected. Each component takes a copy of its local state and...

Filtering options

Keywords:
FAULT TOLERANT COMPUTING
DATA MINING

Publication date

Set a custom date range

Publication type

book (9)
article (1)

Keywords

FAULT TOLERANCE (6)
FAULT TOLERANT SYSTEMS (4)
GRID COMPUTING (4)
RESOURCE ALLOCATION (4)
CHECKPOINTING (3)
LOAD BALANCING (3)
COMPUTATIONAL COMPLEXITY (2)
COMPUTATIONAL MODELING (2)
DISTRIBUTED SYSTEMS (2)
FAULT-TOLERANCE (2)
PEER TO PEER COMPUTING (2)
PEER-TO-PEER (2)
PEER-TO-PEER COMPUTING (2)
REPLICATION (2)
RESOURCE MANAGEMENT (2)
TASK REPLICATION (2)
ACTIVE TASK MANAGEMENT (1)
ADAPTIVE CONTROL (1)
ADAPTIVE TOPOLOGY (1)
ALGORITHM DESIGN AND ANALYSIS (1)
APPLICATION PROGRAM INTERFACES (1)
ARBITRARY DISTRIBUTED NETWORK (1)
ATMOSPHERIC MODELING (1)
AUSTRIAN GRID ENVIRONMENT (1)
AUTONOMIC AND CLOUD COMPUTING (1)
AUTONOMIC COMPUTING (1)
BUSINESS DATA PROCESSING (1)
CHECKPOINT/RESTART (1)
CLOCKS (1)
CLOUD COMPUTING (1)
CLUSTER ENVIRONMENTS (1)
CLUSTER-HEAD SET (1)
CLUSTERING ALGORITHMS (1)
COMPLEXITY THEORY (1)
COMPUTABILITY (1)
COMPUTATION ENGINES (1)
COMPUTATIONAL GRID (1)
COMPUTED TOMOGRAPHY (1)
COMPUTER ARCHITECTURE (1)
COMPUTER CRASHES (1)
CONTROLLING SEMANTICS (1)
CRASH-TOLERANT STORAGE (1)
CYCLIC STORAGE SCHEME DESIGN (1)
DATA BACKUP (1)
DATA MINING TOOLKIT (1)
DATABASE INDEXING (1)
DATABASES (1)
DENOTATIONAL SEMANTICS (1)
DEPENDABILITY (1)
DIFFERENTIATED SERVICE (1)
DIFFSERV NETWORKS (1)
DISTRIBUTED ALGORITHMS (1)
DISTRIBUTED APPLICATIONS (1)
DISTRIBUTED DATABASES (1)
DISTRIBUTED ENVIRONMENT (1)
DISTRIBUTED OBJECT DATABASE (1)
DISTRIBUTED PROCESSING (1)
DISTRIBUTED RESOURCE-BASED COMPUTING ENGINES (1)
DISTRIBUTED SYSTEM (1)
DISTRIBUTED SYSTEM FAULT TOLERANCE (1)
DISTRIBUTED SYSTEM STORAGE (1)
DYNAMIC PER-WORKLOAD WEIGHT (1)
DYNAMIC SCHEDULING (1)
DYNAMIC TOPOLOGY (1)
E-SCIENCE (1)
ECONOMIC GRID SCHEDULING (1)
ECONOMIC-BASED GRID COMPUTING (1)
ENGINES (1)
ENTERPRISE COMPUTING WORKLOAD (1)
ENTERPRISE WORKLOAD ANALYSIS (1)
FAILURE MODEL (1)
FAULT CURRENTS (1)
FAULT TOLERANCE HEURISTIC (1)
FAULT TOLERANT TASK MANAGEMENT (1)
FAULT-TOLERANCE SCHEDULING (1)
FAULT-TOLERANT COLLABORATIVE PEER-TO-PEER OBJECT STORAGE ARCHITECTURE (1)
FAULT-TOLERANT DISTRIBUTED EXECUTION (1)
FIXED PER-WORKLOAD SCHEDULING WEIGHTS (1)
FORMAL MODEL (1)
FULLY DISTRIBUTED TASK MANAGEMENT (1)
GAMES (1)
GLOBAL SCHEDULING POLICY (1)
GRAPH PARTITIONING (1)
GRAPH THEORY (1)
GRID SCHEDULER (1)
GROUP-BASED COORDINATED CHECKPOINTING (1)
GROUPWARE (1)
HETEROGENEOUS DISTRIBUTED RESOURCES (1)
HIGH PERFORMANCE COMPUTING (1)
HIGHLY DISTRIBUTED ENVIRONMENTS (1)
INDEX PROPERTY (1)
INDEXES (1)
INFINIBAND (1)
INFRASTRUCTURE AS A SERVICE (1)
K-CLUSTERING (1)
K-DOMINATING SET (1)
LOAD BALANCING AND TASK ASSIGNMENT (1)

INFONA - science communication portal
