Search results

Items from 1 to 15 out of 15 results

chapter

Implementation of Motion Estimation Based on Heterogeneous Parallel Computing System with OpenCL

Jinglin Zhang, Jean-Francois Nezan, Jean-Gabriel Cousin

2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems > 41 - 45

2012 IEEE 14th Int'l Conf. on High Performance Computing and Communication (HPCC) & 2012 IEEE 9th Int'l Conf. on Embedded Software and Systems (ICESS)

Heterogeneous computing systems increase the performance of parallel computing in many domains of general-purpose computing with CPUs, GPUs and other accelerators. Open Computing Language (OpenCL) is the first open, royalty-free standard for heterogeneous computing on multiple hardware platforms. In this paper, we propose a parallel Motion Estimation (ME) algorithm implemented using OpenCL and present several...

article

A GPU Task-Parallel Model with Dependency Resolution

Stanley Tzeng, Brandon Lloyd, John D. Owens

Computer > 2012 > 45 > 8 > 34 - 41

A task-parallel approach to programming commodity graphics hardware is useful for implementing irregular parallel workloads with dependencies, particularly for applications such as video encoding and backtracking algorithms. The featured Web extra is a video that demonstrates how to use a GPU task-parallel model for H.264 intra prediction. The authors first describe the dependency structure and then...

chapter

Massive Jacobi power flow based on SIMD-processor

C Vilacha, J C Moreira, E Miguez, A F Otero

2011 10th International Conference on Environment and Electrical Engineering > 1 - 4

2011 10th International Conference on Environment and Electrical Engineering (EEEIC)

This paper presents an implementation of the Jacobi power flow algorithm to be run on a single instruction multiple data (SIMD) unit processor. The purpose is to be able to solve a large number of power flows in parallel as quickly as possible. This well-known algorithm was modified taking into account the characteristics of the SIMD architecture. The results show a significant speed-up of the algorithm...

chapter

Acceleration of simulation models for raw materials thermal treatment

Dusan Nascak, Imrich Kostial, Jan Mikula, Andrej Olijar, et al.

2011 12th International Carpathian Control Conference (ICCC) > 203 - 208

2011 12th International Carpathian Control Conference (ICCC)

Parallel data processing is now one of the basic approaches. It can be realized with multi-core processors or with the newer trend of graphics accelerators on recent graphics cards. However, this process is not straightforward: it requires an adequate model structure and parallelization of the application program. An adequate model structure includes not only parallelization...

chapter

Parallel Processing of DCT on GPU

S Tokdemir, S Belkasim

2011 Data Compression Conference > 479

2011 Data Compression Conference (DCC)

This paper presents an implementation of the discrete cosine transform (DCT) on the GPU. The study indicates a clear superiority of the GPU over the CPU as a parallel processor for image compression using DCT. It also indicates that increasing the image size considerably slowed the CPU but did not affect the GPU.

article

Enhancing the Performance of Conjugate Gradient Solvers on Graphic Processing Units

M M Dehnavi, David M Fernández, D Giannacopoulos

IEEE Transactions on Magnetics > 2011 > 47 > 5 > 1162 - 1165

A study of the fundamental obstacles to accelerating the preconditioned conjugate gradient (PCG) method on modern graphic processing units (GPUs) is presented, and several techniques are proposed to enhance its performance over previous work, independent of the GPU generation and the matrix sparsity pattern. The proposed enhancements increase the performance of PCG up to 23 times compared to vector optimized...

chapter

GEDS: GPU Execution of Continuous Queries on Spatio-Temporal Data Streams

J Cazalas, R Guha

2010 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing > 112 - 119

2010 IEEE/IFIP 8th International Conference on Embedded and Ubiquitous Computing (EUC 2010)

Much research exists for the efficient processing of spatio-temporal data streams. However, all methods ultimately rely on an ill-equipped processor, namely a CPU, to evaluate concurrent, continuous spatio-temporal queries over these data streams. This paper presents GEDS, a scalable, Graphics Processing Unit (GPU)-based framework for the evaluation of continuous spatio-temporal queries over spatio-temporal...

chapter

Using GPUs to improve system performance in visual servo systems

Chuantao Zang, Koichi Hashimoto

2010 IEEE/RSJ International Conference on Intelligent Robots and Systems > 3937 - 3942

2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2010)

This paper describes our novel work on using GPUs to improve the performance of a homography-based visual servo system. We present our novel implementation of a GPU-based Efficient Second-order Minimization (GPU-ESM) algorithm. By utilizing the tremendous parallel processing capability of a GPU, we have obtained significant acceleration over its CPU counterpart. Currently our GPU-ESM algorithm can...

chapter

Mixed-Tool Performance Analysis on Hybrid Multicore Architectures

Peng Du, Piotr Luszczek, Stanimire Tomov, Jack Dongarra

2010 39th International Conference on Parallel Processing Workshops > 236 - 244

2010 39th International Conference on Parallel Processing Workshops (ICPPW)

This paper proposes a triangular solve algorithm with variable block size for the graphics processing unit (GPU). By using diagonal blocks inversion with recursion, this algorithm works with a tunable block size to achieve the best performance. Various methods are shown for using existing profiling tools to successfully measure and analyze the performance of this algorithm. We use some of the most...

chapter

A multi-GPU implementation of a Cellular Genetic Algorithm

Pablo Vidal, Enrique Alba

IEEE Congress on Evolutionary Computation > 1 - 7

2010 IEEE Congress on Evolutionary Computation

In this paper, we present a novel implementation of a Cellular Genetic Algorithm (cGA) model for a multi-GPU platform using NVIDIA's CUDA technology. This multi-GPU cGA model is compared first against a serial version on the CPU and then against an implementation on a single GPU. We divide the different operations of the cGA into distinct sets of instructions called kernels. Using the multi-GPU platform...

chapter

Real-time Semi-Global Matching on the CPU

Stefan K Gehrig, Clemens Rabe

2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops > 85 - 92

2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR Workshops)

Among the top-performing stereo algorithms on the Middlebury Stereo Database, Semi-Global Matching (SGM) is commonly regarded as the most efficient algorithm. Consequently, real-time implementations of the algorithm for graphics hardware (GPU) and reconfigurable hardware (FPGA) exist. However, the computation time on general purpose PCs is still more than a second. In this paper, a real-time SGM implementation...

chapter

Fast GPU implementation of large scale dictionary and sparse representation based vision problems

Pradeep Nagesh, Rahul Gowda, Baoxin Li

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 1570 - 1573

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

Recently, Computer Vision problems like Face Recognition and Super-Resolution solved using sparse-representation-based methods with large dictionaries have shown state-of-the-art results. However, such methods are computationally prohibitive for typical CPUs, especially for a large dictionary size. We present a fast implementation of these methods by exploiting the massively parallel processing capabilities...

chapter

Implementation of association rule mining using CUDA

S.H. Adil, S. Qamar

2009 International Conference on Emerging Technologies > 332 - 336

2009 International Conference on Emerging Technologies (ICET)

The purpose of this paper is to implement an association rule mining algorithm using the Nvidia CUDA framework for general-purpose computing on GPUs. The major objective is to compare the performance of the association rule mining algorithm in a C-based implementation on Intel Quad Core/Core2 Duo CPUs with a CUDA-based implementation on Nvidia G80 and GTX 200 series GPUs. The final outcome of this research...

chapter

Processing Neocognitron of Face Recognition on High Performance Environment Based on GPU with CUDA Architecture

G. Poli, J.H. Saito, J.F. Mari, M.R. Zorzan

2008 20th International Symposium on Computer Architecture and High Performance Computing > 81 - 88

2008 20th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)

This work presents an implementation of the neocognitron neural network using a high-performance computing architecture based on the GPU (graphics processing unit). The neocognitron is an artificial neural network, proposed by Fukushima and collaborators, composed of several hierarchical stages of neuron layers organized in two-dimensional matrices called cellular planes. For the high performance computation...

chapter

Automatic Dynamic Task Distribution between CPU and GPU for Real-Time Systems

M. Joselli, M. Zamith, E. Clua, A. Montenegro, et al.

2008 11th IEEE International Conference on Computational Science and Engineering > 48 - 55

2008 IEEE 11th International Conference on Computational Science and Engineering (CSE)

The increase in the computational power of programmable GPUs (graphics processing units) brings new concepts for using these devices for generic processing. Hence, using both the CPU and the GPU for data processing brings new ideas that deal with the distribution of tasks among the CPU and GPU, such as automatic distribution. The importance of the automatic distribution of tasks between CPU and GPU lies in...

Filter options

Keywords:
CPU
PARALLEL PROCESSING


INFONA - science communication portal
