With energy efficiency and power consumption being the primary impediment in the path to exascale systems, low-power high performance embedded systems are of increasing interest. The Parallella System-on-module (SoM) created by Adapteva combines the Epiphany-IV 64-core coprocessor with a host ARM processor housed in a Zynq System-on-chip. The Epiphany integrates low-power RISC cores on a 2D mesh network...
Space is a very important aspect in the simulation of biochemical models; recently, the need for simulation algorithms able to cope with space has become more and more compelling. Complex and large models of biochemical systems need to deal with the movement of single molecules and particles, taking into consideration localised fluctuations, transportation phenomena and diffusion. A common drawback...
Multiple-precision integer operations are key components of many security applications, but unfortunately they are computationally expensive on contemporary CPUs. In this paper, we present the design and implementation of a multiple-precision integer library for GPUs, implemented in CUDA. We report experimental results which show that a significant speedup can be achieved by GPUs as compared...
Because of the very favorable price-to-performance ratio of GPUs, a popular parallel programming configuration today is a cluster of GPUs. However, extracting performance on such a configuration typically requires programming in both MPI and CUDA, and thus a high degree of expertise and effort. It is clearly desirable to be able to support higher-level programming of this emerging high-performance...
The computational power of modern graphics processing units (GPUs) has become an interesting alternative in high performance computing. The specialized hardware of GPUs delivers a high degree of parallelism and performance. Various applications in scientific computing have been implemented such that computationally intensive parts are executed on GPUs. In this article, we present a GPU implementation...
Graphics processing units (GPUs) have emerged as a powerful platform for high-performance computation. They have been successfully used to accelerate many scientific workloads. Typically, the computationally intensive parts of the application are offloaded to the GPU, which serves as the CPU's parallel coprocessor. The key to effective utilization of GPUs for scientific computing is the design and...
While modern large-scale computing tasks have grown to span many machines, each with many cores, traditional programming models have not kept up with these advancements, resulting in difficulty exploiting these computing resources with only modest programmer effort. Thalweg seeks to address this breakdown in several ways. It provides a model for designing algorithms that have the potential to scale...
Optimization algorithms are becoming increasingly important in many areas, such as finance and engineering. Typically, real problems involve several hundreds of variables, and are subject to as many constraints. Several methods have been developed to reduce the theoretical time complexity. Nevertheless, when problems exceed reasonable sizes they end up being very computationally intensive...
In this paper, we introduced three prototypes of GPGPU solutions on the NVidia GeForce 8800GT for a practical Pre-stack Kirchhoff Time Migration program. We presented how to redesign and reimplement the original CPU code as efficient GPU code. The prototypes are up to 7.2 times faster than the CPU version on Intel's P4 3.0G.
This paper discusses how to optimize digital graphics programs for the cache system used in GPU/CPU architectures to gain more FPS. First, we briefly introduce the basic principle of the cache system; second, we discuss in detail the three main organization and mapping technologies of the cache system and compare these three cache mapping solutions with examples; third, we illustrate the cache-friendly...