The term “the Grid” was coined in the mid-1990s to denote a proposed distributed computing infrastructure for advanced science and engineering [4]. Considerable progress has since been made on the construction of such an infrastructure (e.g., [1,6,7]), but the term “Grid” has also been conflated, at least in popular perception, to embrace everything from advanced networking to artificial intelligence...
A software component framework is one where an application designer programs by composing well understood and tested “components” rather than writing large volumes of not-very-reusable code. The software industry has been using component technology to build desktop applications for about ten years now. More recently this idea has been extended to applications in distributed systems with frameworks...
Large memories have become an affordable storage medium for databases involving hundreds of gigabytes on multi-processor systems. In this short note, we review our research on building relational engines to exploit this major shift in hardware perspective. It illustrates that key design issues related to parallelism pose architectural problems at all levels of a system architecture and whose impact...
Throughout the history of computer implementation, the technologies employed for logic to build ALUs and the technologies employed to realize high speed and high-density storage for main memory have been disparate, requiring different fabrication techniques. This was certainly true at the beginning of the era of electronic digital computers where logic was constructed from vacuum tubes and main memory...
Today networking, distributed computing, and parallel computation research have matured to make it possible for distributed systems to support high-performance applications, but resources are dispersed, connectivity is variable, and dedicated access is not possible. In this talk we advocate ‘Computational Grids’ to support ‘large-scale’ applications. These must provide transparent access to...
Parallel computing is a key technology for many areas in science and industry. Outstanding examples are the ASCI and Blue Gene programs that target only very few but critical applications. A much broader spectrum of applications can be found on any of the machines of supercomputing centers all over the world.
Performance analysis and tuning of parallel/distributed applications are very difficult tasks for non-expert programmers. It is necessary to provide tools that automatically carry out these tasks. Many applications have different behavior according to the input data set or even change their behavior dynamically during the execution. Therefore, it is necessary that the performance tuning can be done on the...
Distributed-system observation tools require an efficient data structure to store and query the partial-order of execution. Such data structures typically use vector timestamps to efficiently answer precedence queries. Many current vector-timestamp algorithms either have a poor time/space complexity tradeoff or are static. This limits the scalability of such observation tools. One algorithm, centralized...
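As background for the precedence queries mentioned above, the following is a minimal sketch of classic fixed-size vector timestamps in Python. It is not the algorithm proposed in the paper; the names `Process`, `happened_before`, and `concurrent` are hypothetical and chosen only for illustration, and the number of processes is assumed to be known in advance (the "static" case the abstract criticizes).

```python
from typing import List, Tuple

Timestamp = Tuple[int, ...]


def happened_before(a: Timestamp, b: Timestamp) -> bool:
    """Precedence query: a precedes b if a <= b componentwise and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a: Timestamp, b: Timestamp) -> bool:
    """Two distinct events are concurrent if neither precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


class Process:
    """Hypothetical process holding one vector clock entry per process."""

    def __init__(self, pid: int, n_processes: int) -> None:
        self.pid = pid
        self.clock: List[int] = [0] * n_processes

    def local_event(self) -> Timestamp:
        # Every event increments the process's own component.
        self.clock[self.pid] += 1
        return tuple(self.clock)

    def send(self) -> Timestamp:
        # A send is an event; its timestamp travels with the message.
        return self.local_event()

    def receive(self, msg_ts: Timestamp) -> Timestamp:
        # Merge by componentwise maximum, then count the receive event.
        self.clock = [max(x, y) for x, y in zip(self.clock, msg_ts)]
        self.clock[self.pid] += 1
        return tuple(self.clock)


if __name__ == "__main__":
    p0, p1 = Process(0, 2), Process(1, 2)
    a = p0.local_event()
    m = p0.send()
    b = p1.local_event()
    c = p1.receive(m)
    print(happened_before(a, c), concurrent(a, b))
```

Each timestamp here costs space proportional to the number of processes, which illustrates the time/space tradeoff the abstract refers to.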
Many multiprocessor systems are based on distributed shared memory. For performance reasons, it is often important to statically bind threads to processors in order to avoid remote memory accesses. Finding a good allocation takes a long time and it is hard to know when to stop searching for a better one. It is sometimes impossible to run the application on the target machine. The developer needs a tool that...
Efficient scheduling of task graphs for parallel machines is a major issue in parallel computing. Such algorithms are often hard to understand and hard to evaluate. We present a framework for the visualization of scheduling algorithms. Using the LogP cost model for parallel machines, we simulate the effects of scheduling algorithms for specific target machines and task graphs before performing time...
This paper presents the design, implementation and experimental evaluation of DIOS, an infrastructure for enabling the runtime monitoring and computational steering of parallel and distributed applications. DIOS enables existing application objects (data structures) to be enhanced with sensors and actuators so that they can be interrogated and controlled at runtime. Application objects can be distributed...
A metasystem allows seamless access to a collection of distributed computational resources. Checkpointing is an important service in high throughput computing, especially for process migration and recovery after a system crash. This article describes the experience of incorporating checkpointing and recovery facilities in a Java-based metasystem. Our case study is suma, a metasystem for execution of...
This paper describes an optimised MPI library for the T3E. Previous versions of MPI for the T3E were built on top of the SHMEM interface. This paper describes an optimised version that also uses additional capabilities of the low-level communication hardware.
The performance of parallel and distributed systems and applications — its evaluation, analysis, and optimization — is at once a fundamental topic for research investigation and a technological problem that requires innovations in tools and techniques to keep pace with system and application evolution. This dual view of performance “science” and performance “technology” jointly spans broad fields...
We consider a networking subsystem for message-passing clusters that uses two unidirectional queues for data transfers between the network interface card (NIC) and the lower protocol layers, with polling as the primary mechanism for reading data off these queues. We suggest that for accurate mathematical analysis of such an organization, the values of the system’s state probabilities have to be taken...
The BSP model can be extended with a zero cost synchronization mechanism, which can be used when the number of messages due to receives is known. This mechanism, usually known as “oblivious synchronization”, implies that different processors can be in different supersteps at the same time. An unwanted consequence of this software improvement is a loss of accuracy in prediction. This paper proposes an...
Current analytic solutions to the execution time prediction Y of binary parallel compositions of tasks with arbitrary execution time distributions X1 and X2 are either computationally complex or very inaccurate. In this paper we introduce an analytical approach based on the use of lambda distributions to approximate execution...
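To make the prediction problem concrete, here is a short worked formulation. It rests on two assumptions not stated in the abstract: that a binary parallel composition finishes when both tasks have finished, and that $X_1$ and $X_2$ are independent and non-negative. It shows only the exact textbook relation, not the lambda-distribution approximation the paper introduces; $F_{X_1}$, $F_{X_2}$, and $F_Y$ denote cumulative distribution functions.

```latex
\begin{align}
  Y &= \max(X_1, X_2), \\
  F_Y(y) &= \Pr[X_1 \le y,\; X_2 \le y] = F_{X_1}(y)\, F_{X_2}(y), \\
  \mathbb{E}[Y] &= \int_0^{\infty} \bigl(1 - F_{X_1}(y)\, F_{X_2}(y)\bigr)\, dy .
\end{align}
```

Under these assumptions the product of distribution functions is exact, but for arbitrary distributions the integral generally has no closed form, which is one reason such solutions can be computationally expensive.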
Performance analysis and prediction is an important factor determining the efficiency of parallel programs. Considerable efforts have been made both in pure theoretical analysis and in practical automatic profiling. Unfortunately, contributions in one area seem to ignore the results of the other. We introduce a general performance prediction methodology based on the integration of analytical models...
In this paper we present the Hardware Performance Monitor (HPM) Toolkit, a language independent performance analysis and visualization system developed for performance measurements of applications running on the IBM Power 3 with AIX and on Intel clusters with Linux. The HPM Toolkit supports analysis of applications written in Fortran, C, and C++. It was designed to collect hardware events with low...
As the technology for high-speed networks has evolved over the last decade, the interconnection of commodity computers (e.g., PCs and workstations) at gigabit rates has become a reality. However, the improved performance of high-speed networks has not been matched so far by a proportional improvement in the ability of the TCP/IP protocol stack. As a result the Virtual Interface Architecture (VIA)...