Search results

Items from 1 to 11 out of 11 results

chapter

Automatic Parallelization of Tiled Loop Nests with Enhanced Fine-Grained Parallelism on GPUs

Peng Di, Ding Ye, Yu Su, Yulei Sui, et al.

2012 41st International Conference on Parallel Processing > 350 - 359

2012 41st International Conference on Parallel Processing (ICPP)

Automatically parallelizing loop nests into CUDA kernels must exploit the full potential of GPUs to obtain high performance. One state-of-the-art approach makes use of the polyhedral model to extract parallelism from a loop nest by applying a sequence of affine transformations to the loop nest. However, how to automate this process to exploit both intra- and inter-SM parallelism for GPUs remains a...

chapter

A novel GPU implementation of eigenanalysis for risk management

Mustafa U. Torun, Ali N. Akansu

2012 IEEE 13th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC) > 490 - 494

2012 IEEE 13th Workshop on Signal Processing Advances in Wireless Communications (SPAWC 2012)

Portfolio risk is commonly defined as the standard deviation of its return. The empirical correlation matrix of asset returns in a portfolio has an intrinsic noise component. This noise is filtered for more robust performance. Eigendecomposition is a widely used method for noise filtering. The Jacobi algorithm has been a popular eigensolver technique due to its stability. We present an efficient GPU...

chapter

Automatic Resource Scheduling with Latency Hiding for Parallel Stencil Applications on GPGPU Clusters

Kumiko Maeda, Masana Murase, Munehiro Doi, Hideaki Komatsu, et al.

2012 IEEE 26th International Parallel and Distributed Processing Symposium > 544 - 556

2012 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

Overlapping computations and communication is a key to accelerating stencil applications on parallel computers, especially for GPU clusters. However, such programming is a time-consuming part of the stencil application development. To address this problem, we developed an automatic code generation tool to produce a parallel stencil application with latency hiding automatically from its dataflow model...

chapter

Evaluating Polynomials in Several Variables and their Derivatives on a GPU Computing Processor

Jan Verschelde, Genady Yoffe

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum > 1397 - 1405

2012 26th IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

To obtain more accurate solutions of polynomial systems with numerical continuation methods, we use multiprecision arithmetic. Our goal is to offset the overhead of double double arithmetic by accelerating the path trackers, and in particular Newton's method, with a general purpose graphics processing unit. In this paper we describe algorithms for the massively parallel evaluation and differentiation...

chapter

Gauss-Newton image registration with CUDA

Manal Jalloul, Mohammed Baydoun, Mohamad Adnan Al-Alaoui

2011 18th IEEE International Conference on Electronics, Circuits, and Systems > 305 - 309

2011 18th IEEE International Conference on Electronics, Circuits and Systems - (ICECS 2011)

Image registration is the process of matching different images, whether 2D or 3D, that share certain similar or common properties, for various purposes. This work addresses this field using a Gauss-Newton optimization approach. The problem is formulated as minimizing a cost function, which is then solved by a backtracking line search. Since this is considered a demanding problem, especially for larger...

chapter

Optimizing Algorithm of Sparse Linear Systems on GPU

Dongxu Yan, Haijun Cao, Xiaoshe Dong, Bao Zhang, et al.

2011 Sixth Annual Chinagrid Conference > 174 - 179

2011 Sixth Chinagrid Annual Conference (ChinaGrid)

Linear equations with large sparse coefficient matrices arise in many practical scientific and engineering problems. Previous sparse matrix algorithms for solving linear equations on a single-core CPU are highly complex and time-consuming. To address such problems, targeting the Jacobi iteration algorithm, in this paper we first implement a sparse matrix parallel iteration algorithm on a hybrid multi-core...

chapter

Massive Jacobi power flow based on SIMD-processor

C Vilacha, J C Moreira, E Miguez, A F Otero

2011 10th International Conference on Environment and Electrical Engineering > 1 - 4

2011 10th International Conference on Environment and Electrical Engineering (EEEIC)

This paper presents an implementation of the Jacobi power flow algorithm to be run on a single instruction multiple data (SIMD) unit processor. The purpose is to be able to solve a large number of power flows in parallel as quickly as possible. This well-known algorithm was modified taking into account the characteristics of the SIMD architecture. The results show a significant speed-up of the algorithm...

chapter

Pricing multi-asset American options on Graphics Processing Units using a PDE approach

Duy Minh Dang, Christina C. Christara, Kenneth R. Jackson

2010 IEEE Workshop on High Performance Computational Finance > 1 - 8

2010 Workshop on High Performance Computational Finance at SC10 (WHPCF)

We develop highly efficient parallel pricing methods on Graphics Processing Units (GPUs) for multi-asset American options via a Partial Differential Equation (PDE) approach. The linear complementarity problem arising due to the free boundary is handled by a penalty method. Finite difference methods on uniform grids are considered for the space discretization of the PDE, while classical finite differences,...

chapter

Design and Implementation of Jacobi Algorithms on GPU

Jinxian Lin, Ying Chen

2010 International Conference on Artificial Intelligence and Computational Intelligence > 1 > 448 - 450

2010 International Conference on Artificial Intelligence and Computational Intelligence (AICI 2010)

With the development of GPUs, their floating-point computing capacity has improved rapidly. How to apply the floating-point capability of GPUs to non-graphics computing has become a focus of high performance computing research. Jacobi is a typical application in scientific computing. This paper designs and implements the Jacobi algorithm on Nvidia's CUDA platform and gets a good speedup compared...

chapter

Parallel power flow solutions using a biconjugate gradient algorithm and a Newton method: A GPU-based approach

N Garcia

IEEE PES General Meeting > 1 - 4

2010 IEEE Power & Energy Society General Meeting

A new approach to solve the power flow problem based on graphic processing units is presented in this paper. A Newton method is implemented to solve the set of nonlinear equations of the power flow formulation. A parallel kernel for the biconjugate gradient method allows solving the voltage corrections on a graphic processing card. While the evaluation of the Jacobian matrix is carried out on the...

chapter

Higher order FEM numerical integration on GPUs with OpenCL

Przemysław Płaszewski, Krzysztof Banaś, Paweł Macioł

Proceedings of the International Multiconference on Computer Science and Information Technology > 337 - 342

2010 International Multiconference on Computer Science and Information Technology (IMCSIT 2010)

This paper presents results obtained when porting 2D linear elastostatic FEM local stiffness matrix calculations to the Tesla architecture with the OpenCL framework. A comparison with native NVIDIA CUDA implementations is provided.

Filter options

Content availability:
Available
Data set:
ieee
Keywords:
KERNEL
GRAPHICS PROCESSING UNIT
JACOBIAN MATRICES

Keywords

INSTRUCTION SETS (6)
GPU (5)
COMPUTER GRAPHIC EQUIPMENT (3)
PARALLEL PROCESSING (3)
VECTORS (3)
ALGORITHM DESIGN AND ANALYSIS (2)
APPROXIMATION METHODS (2)
ARRAYS (2)
COPROCESSORS (2)
GRAPHICS (2)
LOAD FLOW (2)
OPTIMIZATION (2)
POWER ENGINEERING COMPUTING (2)
SPARSE MATRICES (2)
TILES (2)
2D LINEAR ELASTOSTATIC LOCAL STIFFNESS MATRIX (1)
ALGORITHMIC DIFFERENTIATION (1)
ALTERNATING DIRECTION IMPLICIT APPROXIMATE FACTORIZATION (1)
AMERICAN OPTION (1)
BICONJUGATE GRADIENT ALGORITHM (1)
BICONJUGATE GRADIENT METHOD (1)
COMPUTATIONAL MODELING (1)
COMPUTE UNIFIED DEVICE ARCHITECTURE (CUDA) (1)
COMPUTER ARCHITECTURE (1)
CORRELATION (1)
CPU (1)
CSR (1)
CUDA (1)
EIGEN DECOMPOSITION (1)
ENCODING (1)
EQUATIONS (1)
FEM NUMERICAL INTEGRATION (1)
FINITE DIFFERENCE (1)
FINITE ELEMENT ANALYSIS (1)
FINITE ELEMENT METHODS (1)
FLOAT-POINT COMPUTING (1)
FLOATING POINT ARITHMETIC (1)
GEOMETRY (1)
GPU-BASED APPROACH (1)
GPUS (1)
GRADIENT METHODS (1)
GRAPHIC PROCESSING UNITS (1)
GRAPHICS PROCESSING UNIT (GPU) (1)
GRAPHICS PROCESSING UNITS (1)
HARDWARE (1)
HIGH PERFORMANCE COMPUTING (1)
HPC (1)
IEEE STANDARDS (1)
IEEE-118 NODE SYSTEM (1)
IEEE-118 STANDARD NETWORK (1)
IMAGE REGISTRATION (1)
INTEGRATION (1)
INTERPOLATION (1)
JACOBI ALGORITHM (1)
JACOBI ALGORITHMS (1)
JACOBI ITERATION (1)
JACOBI POWER FLOW ALGORITHM (1)
LATENCY HIDING (1)
LINEAR ALGEBRA (1)
LINEAR SYSTEM (1)
LOOP PARALLELIZATION (1)
LOOP TILING (1)
MASSIVELY PARALLEL POLYNOMIAL EVALUATION (1)
MATRIX ALGEBRA (1)
MULTI-ASSET (1)
NETWORK EMBEDDING (1)
NEWTON METHOD (1)
NEWTON-RAPHSON ALGORITHM (1)
NOISE (1)
NONLINEAR EQUATIONS (1)
NVIDIA CUDA (1)
NVIDIA CUDA PLATFORM (1)
OPENCL (1)
PARALLEL ARCHITECTURES (1)
PARALLEL COMPUTING (1)
PARALLEL POWER FLOW SOLUTIONS (1)
PEER TO PEER COMPUTING (1)
PENALTY METHOD (1)
POLYNOMIALS (1)
PORTFOLIO RISK (1)
PORTFOLIOS (1)
POWER FLOW (1)
PRICING (1)
RESOURCE SCHEDULING (1)
RIVERS (1)
SHAPE (1)
SIMD-PROCESSOR (1)
SINGLE INSTRUCTION MULTIPLE DATA UNIT PROCESSOR (1)
SPARSE LINEAR SYSTEMS (1)
SPEELPENNING PRODUCT (1)
STENCIL COMPUTATIONS (1)
SYMMETRIC MATRICES (1)
SYNCHRONIZATION (1)
TESLA ARCHITECTURE (1)
VOLTAGE CORRECTIONS (1)

INFONA - science communication portal
