A comparison of PGI OpenACC, CUDA FORTRAN, and Nvidia CUDA implementations of pseudospectral methods on a single GPU, and of GCC FORTRAN implementations on single and multiple CPU cores, is reported. The GPU implementations use cuFFT and the CPU implementations use FFTW. Porting pre-existing FORTRAN codes to GPUs is efficient and easy with OpenACC and CUDA FORTRAN. Example programs are provided.
In this paper, we first discuss the video decoding standard and its architecture, and then analyze the decoding complexity of each process. Using the CUDA programming model and the GPU to optimize the very time-consuming MC (motion compensation) and CSC (color space conversion) stages of decoding, we propose an MC acceleration method based on CUDA, and...
Parallelization of VLSI routing algorithms is one of the challenging problems in VLSI physical design, because the large number of nets and the shared routing resources create data dependencies among concurrent tasks. In this paper, VLSI maze routing using GPGPU is proposed to improve runtime performance. We report up to 3× performance gain with an average of 25%...
IT is a fast-developing science. GPGPU (General Purpose Graphics Processing Unit) technology is now gaining importance. A few years ago, GPUs were designed only to process graphics data; GPGPU enables computation on all kinds of data. This innovation brings a huge boost in computing performance. GPGPU has been under development since 2003. The technology was not available...
In this paper we develop a parallelized implementation of the anisotropic diffusion image preprocessing algorithm for illumination invariant face recognition proposed by Gross and Brajovic. Our implementation employs Red-Black Gauss-Seidel relaxation running on inexpensive Graphics Processing Units (GPUs) programmed with Nvidia's CUDA framework. We are able to achieve a 20X speedup over a multithreaded...
This paper presents the idea of using two accumulators to improve the precision of floating-point computations on graphics processing units (GPUs), for applications in digital signal processing. Increasing the precision of the computations does not require increasing the length of the data words. This is particularly important if hardware limits for the precision of computations...
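The abstract above does not spell out its two-accumulator scheme, so the sketch below shows a well-known technique of the same flavor, Kahan compensated summation, as an assumption about what such a scheme can look like rather than the paper's method. One accumulator holds the running sum and a second holds the rounding error lost at each step; both stay at single-precision word length. The helper `f32` and the function names are hypothetical, and single precision is emulated on the CPU with the standard `struct` module.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_sum_f32(values):
    """Single-accumulator sum with every operation rounded to float32."""
    acc = 0.0
    for v in values:
        acc = f32(acc + f32(v))
    return acc


def two_accumulator_sum_f32(values):
    """Kahan summation: a second float32 accumulator tracks the lost low-order bits."""
    total = 0.0  # main accumulator
    comp = 0.0   # second accumulator: rounding error carried to the next step
    for v in values:
        y = f32(f32(v) - comp)          # re-inject the error lost previously
        t = f32(total + y)              # low-order bits of y may be lost here
        comp = f32(f32(t - total) - y)  # recover what was just lost
        total = t
    return total
```

For example, summing `[1.0] + [1e-8] * 100000` with `naive_sum_f32` stays at 1.0, because each small term is below half a unit in the last place of 1.0, while `two_accumulator_sum_f32` returns a value close to 1.001.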
The idea of distributed computer emulation is presented in this paper. Since classic emulation techniques put the load on the host CPU only, the new approach tries to distribute the load among the other processors available within the host platform. The implementation uses the OpenCL framework. This standard allows writing highly parallel and portable programs in a language based on a subset of ISO C99, runnable...
Recently, GPGPU has been widely adopted in the High Performance Computing (HPC) field. The limited global memory bandwidth poses a great challenge to many GPGPU programmers trying to exploit parallelism within the CPU-GPU heterogeneous platform. In this paper, we choose SWIM, a typical memory-intensive application from the SPEC OMP 2001 benchmark suite, as a case study. We attempt to optimize the performance...
MapReduce is a programming model that enables efficient massive data processing in large-scale computing environments such as supercomputers and clouds. Such large-scale computers employ GPUs for their high peak performance and high memory bandwidth. Since the performance of each job depends on the characteristics of the running application and the underlying computing environment, scheduling MapReduce...
General-purpose computing on graphics processing units (GPGPU) is a popular computing technology used in various fields. In this paper, we parallelize the cryptographic hash processing of a password cracking tool, John the Ripper, using CUDA on GPGPU. We also evaluate our work by comparing the processing time of the GPU-parallelized hash processing with that of John the Ripper on a dual-core...
Graphics Processing Units (GPUs) have been an extensively researched topic in recent years and have been successfully applied to general-purpose applications outside computer graphics. The nVidia CUDA programming model provides a straightforward means of describing inherently parallel computations. In this paper, we present a study of the efficiency of emerging technology in applying General...
The statistics of disease clustering are of interest to epidemiologists. To detect spatial clustering of disease across all regions of China, we adopted a likelihood-ratio-based method that uses Monte Carlo simulation and spatial exploration to analyze data stored in a database and updated in real time. However, the large number of random tests for Monte Carlo simulation and the large scale of the...
In this paper, several methods of optimizing a parallel implementation of the 2D FDTD algorithm are presented. Some practical problems occurring in real simulations are taken into consideration. Moreover, the presented methods are supported with appropriate tests and practical examples.
A GPGPU-based collision detection algorithm is proposed. First, the OBB hierarchy tree and the triangle information of the tested objects are mapped into data textures designed for GPGPU-based calculation, such as triangle vertex textures, a bounding box size texture, and a tree node relationship texture; these textures are then downloaded to the GPU to complete the data preparation. Secondly, the whole...
In this paper, we present a comparative study of implementations of the phase correlation function on GPUs, ASICs, and FPGAs. The Phase Only Correlation (POC) method demonstrates high robustness and subpixel accuracy in pattern matching and image registration. However, its computational speed is a disadvantage because of the 2D FFT and related calculations. We have proposed a novel approach to...
This paper reports on our experience with data structure design for systems that have both multiple CPU cores and a programmable graphics card. We integrate our data structures into the game-like application OpenSteerDemo and compare them on two PC systems. One system has a relatively fast single-core CPU and a slower GPU, whereas the other uses a high-end GPU with a slower multi-core...