The next generation of supercomputers will probably need large amounts of parallelism, both for generating the needed computing power and for masking memory latency. Furthermore, it is necessary to expand the use of parallelism to less regular programs than is usually found in numerical applications. The main obstacle to be overcome is the presence of control dependences, i.e. of situations in which...
The progress of science involves a constant interplay between diversification and unification. Diversification extends the boundaries of science to cover new and wider ranges of phenomena; successful unification reveals that a range of experimentally validated theories are no more than particular cases of some more general principle. The cycle continues when the general principle reveals further directions...
Irregular computations pose some of the most interesting and challenging problems in automatic parallelization. Irregularity appears in certain kinds of numerical problems and is pervasive in symbolic applications. Such computations often use dynamic data structures which make heavy use of pointers. This complicates all the steps of a parallelizing compiler, from independence detection to task partitioning...
We survey strategies for distributing shared objects in large parallel and distributed systems. Examples of such objects are global variables in a parallel program, pages or cache lines in a virtual shared memory system, shared files in a distributed file system, and videos and pictures in a distributed multimedia server. We focus on strategies for distributing, accessing, and (consistently) updating...
The numerical solution of partial differential equations leads to large, sparse systems of equations with up to several million unknowns. Fast iterative algorithms for the solution of these systems are typically based on the multilevel principle. Unfortunately, some of the commonly used programming techniques lead to a high overhead on many advanced computer architectures. The two main sources...
Performance tuning of applications for shared-memory multiprocessors is to a great extent concerned with removal of performance bottlenecks caused by communication among the processors. To simplify performance tuning, our approach has been to extend the hardware/software interface with powerful memory-control primitives in combination with compiler optimizations to remove communication bottlenecks...
The nova radial plot in the *Graph system is used to display communication in data-parallel programs. Nova visualizations play a valuable role in understanding how virtual communication in data-parallel programs translates into network communication, and how source code modifications and data layout change physical communication patterns. A case study demonstrates how *Graph nova views are used to...
This paper describes a technique for evaluating the performance of parallel programs based on software tracing. The value of the proposed method is that it enables post-mortem correction of the intrusion caused by software tracing of nondeterministic programs (the probe effect), using record-replay debugging techniques. In the first phase (record), a primary trace is collected with very low perturbation. This primary...
The need for larger problem sizes and more accurate results pushes users in scientific computing toward parallel machines. Besides the problems of initial program development, parallel program debugging is another hard task, made severely difficult by nondeterminism and race conditions. This paper describes the tools ATEMPT and CDFA, two modules...
The relationship between client-server distributed computing and message-passing parallel processing is explored in this work through an experimental RPC framework for the PVM system. The project investigates the potential for RPC to complement asynchronous message passing in PVM - both to expand the domain of applications, and to evaluate the effectiveness of client-server computing for traditional...
This paper introduces Exdasy, a user-friendly and extendable software tool for partitioning unstructured meshes and mapping mesh partitions to parallel computers. Exdasy was designed to meet the increasing demands placed on today's data distribution systems by the variety of mesh computations, the ongoing development of distribution algorithms, and rapid changes in parallel hardware technology...
We present an infrastructure for building parallel applications by interconnecting slightly modified pre-existing parallel components. This infrastructure (called PHIS) allows the cooperation of components that run on different parallel machines. We describe, in turn, the rationale behind PHIS, the primitives used to interconnect the application components, and its internal architecture, and we...
This paper describes an integrated graphical toolset for performance-oriented design of portable parallel software. The toolset consists of a graphical design tool based on the PVM communications library for building parallel algorithms, a simulation engine and a visualisation tool for animation of program execution and visualisation of platform and network performance measures and statistics. The...
Process migration is one technique for implementing environments that perform automatic load balancing. On networks of workstations, however, the load indices and heuristics that are used must respect the load that is imposed on the system by other users' processes. In this paper we suggest an approach that uses an existing process migration component to construct an automatic load balancing system for...
PVM is currently a widely used software package for developing parallel applications in workstation and parallel environments. In this paper we propose a processor management system for PVM that assigns PVM tasks to the machines of a computer system. The Processors Management System uses two task assignment heuristics, based on neural networks and genetic algorithms.
This paper reports on constructing an exhaustive full program control flow framework for precise data flow analysis of real programs. We discuss the problem of ambiguous calling relations in the presence of function pointers. A flow insensitive analysis is suggested and implemented for real C programs.
The total-exchange is one of the densest communication patterns and is at the heart of numerous applications and programming models in parallel computing. In this paper we present a simple randomized algorithm to efficiently schedule the total-exchange on a toroidal mesh with wormhole switching. This algorithm is based on an important property of the wormhole networks that reach high performance...