High-Performance Computing and Networking
7th International Conference, HPCN Europe 1999 Amsterdam, The Netherlands, April 12–14, 1999 Proceedings

Peter Sloot, Marian Bubak, Alfons Hoekstra, Bob Hertzberger


chapter

Deadlock prevention in incremental replay of message-passing programs

Franco Zambonelli

Lecture Notes in Computer Science > High-Performance Computing and Networking > 593-602

To support incremental replay of message-passing applications, processes must periodically checkpoint and must log some of the messages. The paper shows that known adaptive logging algorithms are likely to introduce deadlocks in replay and presents a new algorithm that prevents deadlocks and achieves better performance.

chapter

Remote and concurrent process duplication for SPMD based parallel processing on COWs

M. Hobbs, A. Goscinski

Lecture Notes in Computer Science > High-Performance Computing and Networking > 603-612

The increasing popularity of a Cluster of Workstations (COW) for the execution of parallel applications can be attributed to its impressive price to performance ratio. Unfortunately, currently available software to manage the execution of parallel applications on COWs does not provide satisfactory levels of performance, nor does it provide the application developer with a friendly programming environment...

chapter

Using BSP to optimize data distribution in skeleton programs

Andrea Zavanella, Susanna Pelagatti

Lecture Notes in Computer Science > High-Performance Computing and Networking > 613-622

Parallel programming can be made easier by means of a skeleton based methodology, such as P³L, which helps programmers to compose their applications by using a set of fixed parallel patterns. This kind of approach is also useful to obtain portability because the “structured” nature of the language can be used to devise a composable support for each parallel pattern so that the...

chapter

Swiss-Tx communication libraries

Stephan Brauss, Martin Frey, Anton Gunzinger, Martin Lienhard, et al.

Lecture Notes in Computer Science > High-Performance Computing and Networking > 623-632

The goal of the Swiss-Tx project is to develop, build and install a series of new supercomputers which are mostly based on commodity parts. Only the communication devices and the communication libraries are custom because available products (e.g. Ethernet with the standard socket interface) do not offer the necessary functionality, bandwidth and latency. This paper presents the high-performance communication...

chapter

Finding the optimal unroll-and-jam

N. Zingirian, M. Maresca

Lecture Notes in Computer Science > High-Performance Computing and Networking > 633-642

Reducing the traffic between CPU and main memory is one of the main issues in the optimization of programs for load/store architectures. It is the register allocation module of optimizing compilers that keeps this traffic low by cleverly associating the program variables to the CPU registers. Since register allocation takes place during code generation and works on the intermediate code produced by...
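As a rough illustration of the transformation named in the title (not the authors' method for choosing the unroll factor, which the truncated abstract does not describe), the sketch below applies unroll-and-jam by hand to a matrix-vector product. The function names and the unroll factor of 2 are assumptions made for this example. The outer loop is unrolled twice and the two copies of the inner loop are fused, so each `x[j]` is read once and reused for two rows.

```python
def matvec_naive(a, x):
    """Plain two-level loop nest: y[i] = sum over j of a[i][j] * x[j]."""
    n, m = len(a), len(x)
    y = [0.0] * n
    for i in range(n):
        for j in range(m):
            y[i] += a[i][j] * x[j]
    return y


def matvec_unroll_and_jam(a, x):
    """Same computation, outer loop unrolled by 2 and inner loops jammed.

    The unroll factor 2 is a hypothetical choice for illustration only.
    """
    n, m = len(a), len(x)
    y = [0.0] * n
    i = 0
    while i + 1 < n:
        acc0 = 0.0  # accumulator for row i
        acc1 = 0.0  # accumulator for row i + 1
        for j in range(m):
            xj = x[j]  # read once, reused for both rows
            acc0 += a[i][j] * xj
            acc1 += a[i + 1][j] * xj
        y[i] = acc0
        y[i + 1] = acc1
        i += 2
    if i < n:  # leftover row when n is odd
        acc = 0.0
        for j in range(m):
            acc += a[i][j] * x[j]
        y[i] = acc
    return y
```

In a compiled language the accumulators and `xj` would be candidates for registers; larger unroll factors increase reuse but also the number of values that must stay in registers at once.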

chapter

A linker for effective whole-program optimizations

Andrea G. M. Cilio, Henk Corporaal

Lecture Notes in Computer Science > High-Performance Computing and Networking > 643-652

The use of a standard binary format in the later part of code generation promotes efficiency and interchangeability of tools, but leaves little information on the source file in the machine code representation. We propose a new approach to code generation, based on a single, highly structured internal format used during proper compilation, machine code generation and linkage. This format offers new...

chapter

The Nestor library: A tool for implementing Fortran source to source transformations

Georges-André Silber, Alain Darte

Lecture Notes in Computer Science > High-Performance Computing and Networking > 653-662

We describe Nestor, a library to easily manipulate Fortran programs through a high level internal representation based on C++ classes. Nestor is a research tool that can be used to quickly implement source to source transformations. The input of the library is Fortran 77, Fortran 90 and HPF 2.0. Its current output supports the same languages plus some dialects such as Petit, OpenMP, CrayMP. Compared...

chapter

Performance measurements on sandglass-type parallelization of doacross loops

Motoyasu Takabatake, Hiroki Honda, Toshitsugu Yuba

Lecture Notes in Computer Science > High-Performance Computing and Networking > 663-672

In this paper, we propose the sandglass-type parallelization technique for a doacross loop, which has the characteristics of both iteration-based parallelization and software pipelining. We prove its effectiveness by comparing the sandglass-type technique to three well-known parallelization techniques: iteration-based, software pipelining, and a combination of doall-type parallel and sequential techniques. We conclude...

chapter

Transforming and parallelizing ANSI C programs using pattern recognition

Maarten Boekhold, Ireneusz Karkowski, Henk Corporaal

Lecture Notes in Computer Science > High-Performance Computing and Networking > 673-682

Code transformations are a very effective method of parallelizing and improving the efficiency of programs. Unfortunately most compiler systems require implementing separate (sub-)programs for each transformation. This paper describes a different approach. We designed and implemented a fully programmable transformation engine. It can be programmed by means of a transformation language. This language...

chapter

Centralized architecture for parallel query processing on networks of workstations

Sijun Zeng, Sivarama P. Dandamudi

Lecture Notes in Computer Science > High-Performance Computing and Networking > 683-692

A network of workstations (NOW) is a cost-effective alternative to a multiprocessor system. Here we propose a centralized architecture for parallel query processing on networks of workstations. We describe a three-level processing strategy and evaluate its performance. The top two levels use a space-sharing technique to assign a partition to a query. The third level uses a chunk-based load sharing policy...

chapter

Object-oriented database system for large-scale molecular dynamics simulations

Jacek Kitowski, Dariusz Wajs, Piotr Trzeciak

Lecture Notes in Computer Science > High-Performance Computing and Networking > 693-701

This paper presents a model of an object-oriented database system for archiving results generated by particle simulations and for retrieving those results from the database for further processing.

chapter

Virtual engineering of multi-disciplinary applications and the significance of seamless accessibility of geometry data

Vaibhav Deshpande, Luciano Fornasier, Edgar A. Gerteisen, Nils Hilbrink, et al.

Lecture Notes in Computer Science > High-Performance Computing and Networking > 702-712

The concept of virtual engineering (VEng) can be understood as a generalization of “multi-disciplinary problem solving”, an ever more used term in scientific computing. An abstract space consisting of the physical, the geometrical, and the cost function directions, called CGP, is introduced. The VEng problem can be seen as a complex manifold embedded in this space. Common standard data formats, unified...

chapter

Some results from a new technique for response time estimation in parallel DBMS

Neven Tomov, Euan Dempster, M. Howard Williams, Albert Burger, et al.

Lecture Notes in Computer Science > High-Performance Computing and Networking > 713-721

The need for tools for performance prediction of parallel database systems is generally recognised. One such tool which has been developed (Steady) is based on analytical techniques to obtain a rapid estimate of performance. The approach to predicting response time involves a heuristic approximation coupled with standard queueing solutions. This paper reports on preliminary results for both maximum...

chapter

PastSet—A distributed structured shared memory system

Brian Vinter, Otto J. Anshus, Tore Larsen

Lecture Notes in Computer Science > High-Performance Computing and Networking > 722-731

The architecture and performance of a structured distributed shared memory system, PastSet, is described. The PastSet abstraction allows programmers to write applications that run efficiently on different architectures from four-way SMP nodes to larger clusters. PastSet is a tuple-based three-dimensional structured distributed shared memory system, which provides the programmer with operations to...

chapter

Optimal scheduling of iterative data-flow programs onto multiprocessors with non-negligible interprocessor communication

D. Antony Louis Piriyakumar, Paul Levi, C. Siva Ram Murthy

Lecture Notes in Computer Science > High-Performance Computing and Networking > 732-743

This paper addresses the problem of optimal compile-time multiprocessor scheduling of iterative data-flow programs with feedback (delay elements). Unlike earlier studies, which assumed the availability of a large number of processors and complete interconnection among them, this work takes the interprocessor communication (IPC) to be non-negligible, which is more realistic. We first explain the effects of...

chapter

Overlapping communication with computation in distributed object systems

Françoise Baude, Denis Caromel, Nathalie Furmento, David Sagnol

Lecture Notes in Computer Science > High-Performance Computing and Networking > 744-753

In the framework of distributed object systems, this paper presents the concepts and an implementation of a mechanism for overlapping communication with computation. This mechanism makes it possible to decrease the execution time of a remote method invocation.

chapter

Exploiting speculative thread-level parallelism on a SMT processor

Pedro Marcuello, Antonio González

Lecture Notes in Computer Science > High-Performance Computing and Networking > 754-763

In this paper we present a run-time mechanism to simultaneously execute multiple threads from a sequential program on a simultaneous multithreaded (SMT) processor. The threads are speculative in the sense that they are created by predicting the future control flow of the program. Moreover, threads are not necessarily independent. Data dependences among simultaneously executed threads may exist. To...

chapter

Network interface active messages for low overhead communication on SMP PC clusters

Motohiko Matsuda, Yoshio Tanaka, Kazuto Kubota, Mitsuhisa Sato

Lecture Notes in Computer Science > High-Performance Computing and Networking > 764-773

NICAM is a communication layer for SMP PC clusters connected via Myrinet, designed to reduce overhead and latency by directly utilizing a micro-processor equipped on the network interface. It adopts remote memory operations to reduce much of the overhead found in message passing. NICAM employs an Active Messages framework for flexibility in programming on the network interface, and this flexibility...

chapter

Experimental results about MPI collective communication operations

Massimo Bernaschi, Giulio Iannello, Mario Lauria

Lecture Notes in Computer Science > High-Performance Computing and Networking > 774-783

Collective communication performance is critical in a number of MPI applications, yet relatively few results are available to assess the performance of mainstream MPI implementations. In this paper we focus on two widely used primitives, broadcast and reduce, and present experimental results for the Cray T3E and the IBM SP2. We compare the performance of the existing MPI primitives with our implementation...
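For readers unfamiliar with how a broadcast can be structured, the sketch below simulates a binomial-tree schedule, one common textbook approach. The truncated abstract does not say which algorithm the authors implemented, so this should not be read as their method. It is pure Python with no MPI calls, and the function name `binomial_broadcast_schedule` is hypothetical.

```python
def binomial_broadcast_schedule(num_ranks, root=0):
    """Return the rounds of a binomial-tree broadcast.

    Each round is a list of (sender, receiver) rank pairs. In every round,
    each rank that already holds the data sends it to one rank that does not,
    so the number of holders doubles until all ranks are covered.
    """
    rounds = []
    have = 1  # how many ranks (numbered relative to root) hold the data
    while have < num_ranks:
        sends = []
        for rel in range(have):
            peer = rel + have
            if peer < num_ranks:
                sender = (rel + root) % num_ranks
                receiver = (peer + root) % num_ranks
                sends.append((sender, receiver))
        rounds.append(sends)
        have *= 2
    return rounds


if __name__ == "__main__":
    for step, sends in enumerate(binomial_broadcast_schedule(8)):
        print(step, sends)
```

With this schedule the number of rounds grows with the logarithm of the rank count, whereas a root that sends to every other rank in turn needs `num_ranks - 1` sequential sends.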

chapter

MaDCoWS: A scalable distributed shared memory environment for massively parallel multiprocessors

Dimitris Dimitrelos, Constantine Halatsis

Lecture Notes in Computer Science > High-Performance Computing and Networking > 784-793

In this paper we present MaDCoWS, a software implementation of a Distributed Shared Memory (DSM) runtime system, specifically designed for massively parallel 2-D grid multiprocessors. The system takes advantage of the network topology in order to minimise the paths of the message sequences realising the shared operations. As a result its performance is increased and the system becomes scalable even...

Series:
Lecture Notes in Computer Science


