Summary form only given. The paper discusses the impact of cloud and grid technology on HPCC, using examples from a variety of fields, especially the life sciences. It covers the growing importance of data analysis and notes that data analysis is better suited to these modern architectures than the large simulations (particle dynamics and partial differential equation solution) that are mainstream...
The blade system is very popular in high performance computing. In a blade system, the blade is the fundamental element and contains symmetric multi-processors (SMP). About ten blades constitute a blade box, several blade boxes constitute a cabinet, and several cabinets in turn constitute a blade system. The blades in a blade box are neighbors because the distance between them is relatively short. Programmers always...
This paper introduces a new technique to exploit compositions of different data-layout techniques with Hit map, a library for hierarchical-tiling and automatic mapping of arrays. We show how Hit map is used to implement block-cyclic layouts for a parallel LU decomposition algorithm. The paper compares the well-known ScaLAPACK implementation of LU, as well as other carefully optimized MPI versions,...
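As a rough illustration of the block-cyclic layout named in this abstract, the sketch below shows the standard owner computation for such a layout. It is a generic example and not Hit map's or ScaLAPACK's API; the function names and parameters are hypothetical and chosen only for illustration.

```python
def block_cyclic_owner(index, block_size, num_procs):
    """Owner of a global index in a 1-D block-cyclic layout.

    Indices are grouped into blocks of `block_size`, and the blocks are
    dealt out to processes 0..num_procs-1 in round-robin order.
    (Hypothetical helper, not part of any library.)
    """
    return (index // block_size) % num_procs


def block_cyclic_owner_2d(row, col, block_size, grid_rows, grid_cols):
    """Owner (process-grid coordinates) of matrix element (row, col).

    Applies the 1-D rule independently to rows and columns, assuming
    square blocks and a grid_rows x grid_cols process grid.
    """
    return (
        block_cyclic_owner(row, block_size, grid_rows),
        block_cyclic_owner(col, block_size, grid_cols),
    )


if __name__ == "__main__":
    # Print which process owns each of the first 8 indices
    # with blocks of 2 elements spread over 3 processes.
    for i in range(8):
        print(i, block_cyclic_owner(i, block_size=2, num_procs=3))
```

Dealing blocks out cyclically keeps the work balanced in LU decomposition, where the active part of the matrix shrinks as the factorization proceeds.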
Improving MPI foundational software to suit multicore systems is a key issue for developing effective parallel software in the high performance communication domain. To address this issue, we propose a novel technique called MPI Accelerator, or MPIActor for short, which is a transparent middleware that enhances conventional MPI libraries. The main idea is to optimize MPI routines for multicore...
Algorithmic skeletons encapsulate typical parallel programming patterns so that users can apply them easily. Existing skeleton libraries usually work on distributed memory machines. We present an extension of our skeleton library Muesli which now allows the same application to be used without modification on a variety of parallel machines ranging from multi-processor distributed memory to many-core...
The ever-increasing power of high-performance computers and advances in numerical techniques make possible the realistic study of two-phase flow problems in three spatial dimensions. Unfortunately, there is often still a gap today between the design of numerical algorithms and the characteristics of the hardware on which the algorithms are executed. For the solution of a particular subproblem of...
Predicting the performance of parallel applications is becoming increasingly complex. The best performance predictor is the application itself, but the time required to run it thoroughly is an onerous requirement. We seek to characterize the behavior of message-passing applications on different systems by extracting a signature which will allow us to predict which system will allow the application to...
In this paper, we present a performance analysis of two NASA applications using performance tools such as Tuning and Analysis Utilities (TAU) and SGI MP Inside. MITgcmUV and OVERFLOW are two production-quality applications used extensively by scientists and engineers at NASA. MITgcmUV is a global ocean simulation model, developed by the Estimating the Circulation and Climate of the Ocean (ECCO) Consortium,...
Virtualization technology is currently widely used because of its benefits in high resource utilization, flexible manageability and powerful system security. However, its use for high performance computing (HPC) is still not popular because the virtualization overheads are not well understood. It is worthwhile to evaluate the virtualization cost and to find the performance bottleneck when running HPC applications...
Process placement is a technique widely used on parallel machines with heterogeneous interconnects to reduce the overall communication time. For instance, two processes which communicate frequently are mapped close to each other. Finding the optimal mapping between threads and cores in a shared-memory environment (for example, OpenMP and Pthreads) is an even more complex task due to implicit communication...
Due to the demanding requirements of 3D games and applications, graphics processing units (GPUs) or general-purpose graphics processing units (GPGPUs) have become required components in modern computer systems. While these devices enable high parallelism with a huge number of processing elements, the utilization of their capabilities in general scientific applications is still low due to their difficult...