Search results

Items from 1 to 20 out of 42 results

chapter

Using 1000+ GPUs and 10000+ CPUs for Sedimentary Basin Simulations

Mei Wen, Huayou Su, Wenjie Wei, Nan Wu, more

2012 IEEE International Conference on Cluster Computing > 27 - 35

2012 IEEE International Conference on Cluster Computing (CLUSTER)

In cutting-edge CPU/GPU hybrid clusters, such as Tianhe-1A, the aggregate CPU computing capability may amount to up to 1/3 of the aggregate GPU computing capability. It thus goes without saying that the CPUs and GPUs should jointly carry out the computational work. However, to effectively and simultaneously use both the hardware components requires great care when developing the parallel implementations...

chapter

Performance Comparisons of Parallel Power Flow Solvers on GPU System

Chunhui Guo, Baochen Jiang, Hao Yuan, Zhiqiang Yang, more

2012 IEEE International Conference on Embedded and Real-Time Computing Systems and Applications > 232 - 239

2012 IEEE 18th International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA 2012)

This paper transforms sequential power flow problem to a parallel problem and solves it on GPU. In particular, we implement parallel Gauss-Seidel solver, Newton-Raphson solver, and P-Q decoupled solver using CUDA (Compute Unified Device Architecture) on GPU. The aim is to investigate the performance of the three different parallel power flow solvers. We use four IEEE standard power systems and one...

chapter

Implementation of a Lattice Boltzmann Method for Large Eddy Simulation on Multiple GPUs

Qinjian Li, Chengwen Zhong, Kai Li, Guangyong Zhang, more

2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems > 818 - 823

2012 IEEE 14th Int'l Conf. on High Performance Computing and Communication (HPCC) & 2012 IEEE 9th Int'l Conf. on Embedded Software and Systems (ICESS)

Recently, the Graphic Processor Unit (GPU) has evolved into a highly parallel, multithreaded, many-core processor with tremendous computational horsepower and very high memory bandwidth. To improve the simulation efficiency of complex flow phenomena in the field of computational fluid dynamics, a CUDA-based simulation algorithm of large eddy simulation using multiple GPUs is proposed. Our implementation...

chapter

Fast calculation of computer-generated holography using multi-graphic processing units

Joongseok Song, Jungsik Park, Jong-Il Park

IEEE International Symposium on Broadband Multimedia Systems and Broadcasting > 1 - 5

2012 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)

A process of generating a digital hologram requires a lot of time-consuming computations. Therefore, it is important to reduce the computation time or the number of computations for achieving real-time digital holographic video generation. In this paper, we propose a method of parallelizing the computations using multiple GPUs with CUDA and OpenMP and an optimization method for reducing the computation...

chapter

Teaching Parallel Programming Models on a Shallow-Water Code

Alexander Breuer, Michael Bader

2012 11th International Symposium on Parallel and Distributed Computing > 301 - 308

2012 11th International Symposium on Parallel and Distributed Computing (ISPDC)

We present a software package that supports teaching different parallel programming models in a computational science and engineering context. It implements a Finite Volume solver for the shallow water equations, with application to tsunami simulation in mind. The numerical model is kept simple, using patches of Cartesian grids as computational domain, which can be connected via ghost layers. The...

chapter

GPU accelerated simulation of the human arterial circulation

Lucian Itu, Sharma Puneet, Ali Kamen, Constantin Suciu, more

2012 13th International Conference on Optimization of Electrical and Electronic Equipment (OPTIM) > 1478 - 1485

2012 13th International Conference on Optimization of Electrical and Electronic Equipment

A GPU accelerated implementation of a reduced-order model of the human arterial circulation is introduced. The computationally intensive tasks of the algorithm (namely, the computation of the flow rate and area values at the interior grid points of the domain) have been migrated to the GPU. The CPU not only coordinates the actions performed by the GPU, but it also computes the inflow, bifurcation...

chapter

A Fast Parallel Implementation of Molecular Dynamics with the Morse Potential on a Heterogeneous Petascale Supercomputer

Qiang Wu, Canqun Yang, Feng Wang, Jingling Xue

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum > 140 - 149

2012 26th IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

Molecular Dynamics (MD) simulations have been widely used in the study of macromolecules. To ensure an acceptable level of statistical accuracy relatively large number of particles are needed, which calls for high performance implementations of MD. These days heterogeneous systems, with their high performance potential, low power consumption, and high price-performance ratio, offer a viable alternative...

chapter

CUDA based Particle Swarm Optimization for geophysical inversion

Debanjan Datta, Suman Mehta, Shalivahan, Ravi Srivastava

2012 1st International Conference on Recent Advances in Information Technology (RAIT) > 416 - 420

2012 1st International Conference on Recent Advances in Information Technology (RAIT)

Many geophysical problems are computationally expensive owing to their iterative nature or due to the programs processing to large datasets. Such problems are challenging and have to be approached with extreme caution because a wrong parameter selection will not only lead to wrong results but will also take up a lot of time. The Compute Unified Device Architecture (CUDA) introduced by NVIDIA has enabled...

chapter

Accelerating Fibre Orientation Estimation from Diffusion Weighted Magnetic Resonance Imaging Using GPUs

Moises Hernández, Gines D. Guerrero, Jose M. Cecilia, Jose M. Garcia, more

2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing > 622 - 626

2012 20th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)

Diffusion Weighted Magnetic Resonance Imaging (DW-MRI) and tractography approaches are the only tools that can be utilized to estimate structural connections between different brain areas, non-invasively and in-vivo. A first step that is commonly utilized in these techniques includes the estimation of the underlying fibre orientations and their uncertainty in each voxel of the image. A popular method...

chapter

Performance evaluation of CPU-GPU and CPU-only algorithms for detecting defective tablets through morphological imaging techniques

Hasan Baig, Jeong-A Lee, Jieun Lee

7th Iberian Conference on Information Systems and Technologies (CISTI 2012) > 1 - 6

2012 7th Iberian Conference on Information Systems and Technologies (CISTI)

Pharmaceutical industries which are intended for the packaging of different tablets in a strip of blister need to make sure that the tablets are free from defects before letting them go into the packing box. The purpose of this project is to speed-up the system process via implementing the image processing algorithm on GPU. Morphological and mathematical operations have been implemented on both GPU...

chapter

A dynamic scheduling framework for emerging heterogeneous systems

Vignesh T. Ravi, Gagan Agrawal

2011 18th International Conference on High Performance Computing > 1 - 10

2011 18th International Conference on High Performance Computing (HiPC)

A trend that has materialized, and has given rise to much attention, is of the increasingly heterogeneous computing platforms. Recently, it has become very common for a desktop or a notebook computer to be equipped with both a multi-core CPU and a GPU. Application development for exploiting the aggregate computing power of such an environment is a major challenge today. Particularly, we need dynamic...

chapter

A multi-GPU algorithm for communication in neuronal network simulations

Raphael Y. de Camargo

2011 18th International Conference on High Performance Computing > 1 - 10

2011 18th International Conference on High Performance Computing (HiPC)

Graphical Processing Units (GPUs) are frequently used for simulations of physical and biological systems. The simulated systems are often composed of simple elements that communicate only with their neighbors. But in some systems, such as large-scale neuronal networks, each element can communicate with any other element in the simulation. In this work, we present an efficient CUDA algorithm that...

chapter

Some recent developments of the DGTD method with practical applications

Stephane Lanteri

2011 Loughborough Antennas & Propagation Conference > 1 - 4

2011 Loughborough Antennas & Propagation Conference (LAPC)

We report on recent developments aiming at improving the accuracy and the performances of a discontinuous Galerkin time domain method (DGTD) for the simulation of time-domain electromagnetic wave propagation problems involving general domains and heterogeneous media. The common objective of the associated studies is to bring the method to a level of computational efficiency and flexibility that allows...

chapter

Accelerating multi-scale flows for LDDKBM diffeomorphic registration

Stefan Sommer

2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops) > 499 - 505

2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops)

Registrations in medical imaging and computational anatomy can be obtained using the Large Deformation Diffeomorphic Kernel Bundle Mapping (LDDKBM) framework. This provides a registration algorithm with a solid mathematical foundation while incorporating regularization of deformation at multiple scales. Because the variational formulation of LDDKBM implies a heavy computational burden in the search...

chapter

Variational Depth from Defocus in real-time

Rami Ben-Ari, Gonen Raveh

2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops) > 522 - 529

2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops)

With emerging of next generation of digital cameras offering a 3D reconstruction of a viewed scene, Depth from Defocus (DFD) presents an attractive option. In this approach the depth profile of the scene is recovered from two views captured in different focus setting. The DFD is well known as a computationally-intensive method due to the shift-variant filtering involved with its estimation. In this...

chapter

4D simulation of nonlinear pressure field propagation on GPU with the angular spectrum method

Francois Varray, Christian Cachard, Piero Tortoli, Olivier Basset

2011 IEEE International Ultrasonics Symposium > 250 - 253

2011 IEEE International Ultrasonics Symposium (IUS)

The simulation of nonlinear propagation of ultrasound wave in biological tissue is a very time consuming operation. Different simulators, based on finite difference or angular spectrum methods have been reported in the literature and the second one provide faster simulations by considering separately the different harmonics. In this paper we proposed to use a generalized angular spectrum method (GASM)...

chapter

Multiphase LBM Distributed over Multiple GPUs

Carlos Rosales

2011 IEEE International Conference on Cluster Computing > 1 - 7

2011 IEEE International Conference on Cluster Computing (CLUSTER)

A parallel distributed CUDA implementation of a Lattice Boltzmann Method for multiphase flows with large density ratios is described in this paper. Validation runs studying the terminal velocity of a rising bubble under the effect of gravity show good agreement with the expected theoretical values. The code is benchmarked against the performance of a typical CPU implementation of the same algorithm...

chapter

Parallel grid-based method and belief fusion — Real-time cooperative non-Gaussian estimation

Tomonari Furukawa, Xianqiao Tong, Gamini Dissanayake, Hugh F. Durrant-Whyte

2011 6th International Conference on Industrial and Information Systems > 370 - 375

2011 IEEE 6th International Conference on Industrial and Information Systems (ICIIS)

This paper presents a parallel grid-based method and belief fusion for real-time cooperative Bayesian estimation. The grid-based recursive Bayesian estimation (RBE) method effectively maintains the belief of objects even with no detection event but requires large computation for its prediction and correction processes as well as fusion process in cooperative estimation. In order for real-time estimation,...

chapter

CNN based high performance computing for real time image processing on GPU

Sasanka Potluri, Alireza Fasih, Laxminand Kishore Vutukuru, Fadi Al Machot, more

Proceedings of the Joint INDS'11 & ISTET'11 > 1 - 7

2011 Joint 3rd Int'l Workshop on Nonlinear Dynamics and Synchronization (INDS) & 16th Int'l Symposium on Theoretical Electrical Engineering (ISTET)

Many of the basic image processing tasks suffer from processing overhead to operate over the whole image. In real time applications the processing time is considered as a big obstacle for its implementations. A High Performance Computing (HPC) platform is necessary in order to solve this problem. The usage of hardware accelerator make the processing time low. In recent developments, the Graphics Processing...

chapter

Python for Development of OpenMP and CUDA Kernels for Multidimensional Data

Bogdan Vacaliuc, Dilip R. Patlolla, Ed. D'Azevedo, Greg G. Davidson, more

2011 Symposium on Application Accelerators in High-Performance Computing > 159 - 167

2011 Symposium on Application Accelerators in High-Performance Computing (SAAHPC)

Design of data structures for high performance computing (HPC) is one of the principal challenges facing researchers looking to utilize heterogeneous computing machinery. Heterogeneous systems derive cost, power, and speed efficiency by being composed of the appropriate hardware for the task. Yet, each type of processor requires a specific organization of the application state in order to achieve...

Keywords:
MATHEMATICAL MODEL
KERNEL

