Search results

Items from 1 to 8 out of 8 results

chapter

Comparing SpMV for solver applications

Rohit Patel, Vibha Patel, Bhavin Patel

2013 Nirma University International Conference on Engineering (NUiCONE) > 1 - 6

2013 Nirma University International Conference on Engineering (NUiCONE)

In this paper, we propose a new re-ordering technique for improving the performance of Sparse Matrix Vector Multiplication (SpMV) for systems supported with Graphics Processing Units (GPUs). We conducted the test by applying SpMV on solver based applications which are widely used in the domain of engineering and science. We studied and analyzed the existing representations and storage structures of...

chapter

Virtual Systolic Array for QR Decomposition

Jakub Kurzak, Piotr Luszczek, Mark Gates, Ichitaro Yamazaki, et al.

2013 IEEE 27th International Symposium on Parallel and Distributed Processing > 251 - 260

2013 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

Systolic arrays offer a very attractive, data centric, execution model as an alternative to the von Neumann architecture. Hardware implementations of systolic arrays turned out not to be viable solutions in the past. This article shows how the systolic design principles can be applied to a software solution to deliver an algorithm with unprecedented strong scaling capabilities. Systolic array for...

article

Fast Sparse Level Sets on Graphics Hardware

Andrei C. Jalba, Wladimir J. van der Laan, Jos B.T.M. Roerdink

IEEE Transactions on Visualization and Computer Graphics > 2013 > 19 > 1 > 30 - 44

The level-set method is one of the most popular techniques for capturing and tracking deformable interfaces. Although level sets have demonstrated great potential in visualization and computer graphics applications, such as surface editing and physically based modeling, their use for interactive simulations has been limited due to the high computational demands involved. In this paper, we address...

chapter

Automatic Parallelization of Tiled Loop Nests with Enhanced Fine-Grained Parallelism on GPUs

Peng Di, Ding Ye, Yu Su, Yulei Sui, et al.

2012 41st International Conference on Parallel Processing > 350 - 359

2012 41st International Conference on Parallel Processing (ICPP)

Automatically parallelizing loop nests into CUDA kernels must exploit the full potential of GPUs to obtain high performance. One state-of-the-art approach makes use of the polyhedral model to extract parallelism from a loop nest by applying a sequence of affine transformations to the loop nest. However, how to automate this process to exploit both intra and inter-SM parallelism for GPUs remains a...

chapter

Accelerating Numerical Linear Algebra Kernels on a Scalable Run Time Reconfigurable Platform

Prasenjit Biswas, Pramod P Udupa, Rajdeep Mondal, Keshavan Varadarajan, et al.

2010 IEEE Computer Society Annual Symposium on VLSI > 161 - 166

2010 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2010)

Numerical Linear Algebra (NLA) kernels are at the heart of all computational problems. These kernels require hardware acceleration for increased throughput. NLA Solvers for dense and sparse matrices differ in the way the matrices are stored and operated upon although they exhibit similar computational properties. While ASIC solutions for NLA Solvers can deliver high performance, they are not scalable,...

chapter

Accelerating the Nonuniform Fast Fourier Transform Using FPGAs

Srinidhi Kestur, Sungho Park, Kevin M Irick, Vijaykrishnan Narayanan

2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines > 19 - 26

2010 IEEE 18th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM 2010)

We present an FPGA accelerator for the Non-uniform Fast Fourier Transform, which is a technique to reconstruct images from arbitrarily sampled data. We accelerate the compute-intensive interpolation step of the NuFFT Gridding algorithm by implementing it on an FPGA. In order to ensure efficient memory performance, we present a novel FPGA implementation for Geometric Tiling based sorting of the arbitrary...

chapter

Inter-kernel data reuse and pipelining on chip-multiprocessors for multimedia applications

L.A.D. Bathen, Yongjin Ahn, N.D. Dutt, S. Pasricha

2009 IEEE/ACM/IFIP 7th Workshop on Embedded Systems for Real-Time Multimedia > 45 - 54

2009 IEEE/ACM/IFIP 7th Workshop on Embedded Systems for Real-Time Multimedia. ESTIMedia 2009

The increasing demand for low power and high performance multimedia embedded systems has motivated the need for effective solutions to satisfy application bandwidth and latency requirements under a tight power budget. As technology scales, it is imperative that applications are optimized to take full advantage of the underlying resources and meet both power and performance requirements. We propose...

chapter

A Lightweight Iterative Compilation Approach for Optimization Parameter Selection

Yonggang Che, Zhenghua Wang

First International Multi-Symposiums on Computer and Computational Sciences (IMSCCS'6) > 1 > 318 - 325

First International Multi-Symposiums on Computer and Computational Sciences

A key step in program performance optimization is to determine optimal values for certain parameters. Static approaches determine these values based on analytical models. However, complex computer architectures and complex code structures limit their effectiveness. Execution-driven approaches like iterative compilation determine these parameter values by executing the program with different parameter...

Filter options

Data set:
ieee
Keywords:
KERNEL
ARRAYS
TILES

