Scientists who want to exploit the computing power of the latest parallel architectures face a diverse set of architectures and a range of programming languages, models and approaches. Among these techniques are the directive-based programming models OpenMP and OpenACC. This paper explores the similarities and the functionality gaps between the two models and presents insights...
Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.
Programming accelerators today usually requires managing separate virtual and physical memories, such as allocating space in and copying data between host and device memories. The OpenACC API provides data directives and clauses to control this behavior where it is required. This paper describes how the data model is supported in current OpenACC implementations, ranging from research compilers (OpenUH...
As the HPC and Big Data communities continue to converge, heterogeneous and distributed systems are becoming commonplace. In order to take advantage of the immense computing power of these systems, distributing data efficiently and leveraging specialized hardware (e.g. accelerators) is critical. MapReduce is a popular paradigm that provides automatic data distribution to the programmer. CUDA and OpenCL...
The Fast Fourier Transform (FFT) is one of the most important numerical tools, widely used in many scientific and engineering applications. The algorithm performs O(n log n) operations on n input data points even when only a small number k of large coefficients need to be calculated, while the remaining n - k outputs are zero or negligibly small. The algorithm is clearly inefficient when n input data points lead...
Heterogeneous multicore embedded systems are rapidly growing, with cores of varying types and capacity. Programming these devices and exploiting the hardware has been a real challenge. Existing programming models and their runtimes are typically meant for general-purpose computation and are mostly too heavyweight to be adopted for resource-constrained embedded systems. Embedded programmers are still expected...
Multicore embedded systems are rapidly emerging. Hardware designers are packing more and more features into their designs. Introducing heterogeneity into these systems, i.e. adding cores of varying types, does provide opportunities to solve problems in different ways. However, this presents several challenges to embedded system programmers, since software is still not mature enough to efficiently exploit...
Accelerators offer the potential to significantly improve the performance of scientific applications by offloading compute-intensive portions of programs to them. However, effectively tapping their full potential is difficult owing to the programmability challenges users face when mapping computational algorithms onto massively parallel architectures such as GPUs. Directive-based...
Directive-based programming models provide a high level of abstraction, hiding complex low-level details of the underlying hardware from the programmer. One such model is OpenACC, a portable programming model that allows programmers to write applications offloading portions of work from a host CPU to an attached accelerator (a GPU or similar device). The model is gaining popularity and...
The energy efficiency of GPUs has facilitated their use in many complex scientific applications. Nodes combining multiple GPUs with multi-core CPUs are quite common in today's HPC landscape. This gives the flexibility to utilize CPUs, accelerators, or both according to workload characteristics. Since it is not possible to measure power and energy accurately in all cases, an alternate approach...
Heterogeneous computing comes with tremendous potential and is a leading candidate for scientific applications that are becoming more and more complex. Accelerators such as GPUs, whose computing momentum is growing faster than ever, offer substantial application performance when compute-intensive portions of an application are offloaded to them. It is quite evident that future computing architectures are moving...
Many scientific and technical programmers consider accelerators a viable way to program and accelerate huge scientific applications. Accelerators such as GPUs have immense potential in terms of compute capacity, but programming these devices is a challenge. CUDA, OpenCL and other vendor-specific models are certainly a way to go, but these are low-level models that demand excellent...
The shift towards multicore architectures poses significant challenges to programmers. Unlike programming on single-core architectures, multicore architectures require the programmer to decide how work is distributed across multiple processors. In this contribution, we analyze the needs of a high-level programming model for programming multicore architectures. We use OpenMP as the high-level...
To improve the energy efficiency of parallel applications on GPGPUs, a better understanding of the energy behavior of various applications is mandatory. In this study we employ statistical methods to model power and energy consumption of some common optimized high-performance kernels (DGEMM, FFT, PRNG and FD stencils) on a multi-GPU platform.
A dramatic improvement in energy efficiency is mandatory for sustainable supercomputing and has been identified as a major challenge. Affordable energy solutions continue to be of great concern in the development of the next generation of supercomputers. Low-power processors, dynamic control of processor frequency and heterogeneous systems are being proposed to mitigate energy costs. However, the...