Search results

Items from 1 to 11 out of 11 results

chapter

Fast Linear Algebra on GPU

Lukas Polok, Pavel Smrz

2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems > 439 - 444

2012 IEEE 14th Int'l Conf. on High Performance Computing and Communication (HPCC) & 2012 IEEE 9th Int'l Conf. on Embedded Software and Systems (ICESS)

GPUs have been successfully used for acceleration of many mathematical functions and libraries. A common limitation of those libraries is the minimum size of primitives that must be handled in order to achieve significant speedups compared to their CPU versions. The minimum size requirement can prove prohibitive for many applications. It can be loosened by batching operations to have a sufficient amount of data...

chapter

Accelerating multi-scale flows for LDDKBM diffeomorphic registration

Stefan Sommer

2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops) > 499 - 505

2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops)

Registrations in medical imaging and computational anatomy can be obtained using the Large Deformation Diffeomorphic Kernel Bundle Mapping (LDDKBM) framework. This provides a registration algorithm with a solid mathematical foundation while incorporating regularization of deformation at multiple scales. Because the variational formulation of LDDKBM implies a heavy computational burden in the search...

chapter

Improving GPU Robustness by making use of faulty parts

Artem Durytskyy, Mohamed Zahran, Ramesh Karri

2011 IEEE 29th International Conference on Computer Design (ICCD) > 346 - 351

2011 IEEE 29th International Conference on Computer Design (ICCD 2011)

With hundreds of processing units in current state-of-the-art graphics processing units (GPUs), the probability that one or more processing units fail due to permanent faults, during fabrication or post deployment, increases drastically. In our experiments we found that the loss of a single streaming multiprocessor (SM) in an 8-SM GPU resulted in as much as 16% performance loss. The default method...

chapter

Using GPUs to accelerate FPGA wirelength estimate for use with complex search operators

Christian Fobel, Gary Grewal, Deborah Stacey

2011 24th Canadian Conference on Electrical and Computer Engineering (CCECE) > 1129 - 1134

2011 24th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)

As the precise wirelength for a given placement can only be known after routing, accurate and fast-to-compute wirelength estimates are required for FPGA placement algorithms. Two of the more effective wirelength estimation models are HPWL [1] and Star+ [2]. However, both of these models are expensive to compute, requiring O(nm) time, where n is the number of nets and m is the average number of blocks...

chapter

Parallel cross-layer optimization of high-level synthesis and physical design

J Williamson, Yinghai Lu, Li Shang, Hai Zhou, et al.

16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011) > 467 - 472

2011 16th Asia and South Pacific Design Automation Conference, ASP-DAC 2011

Integrated circuit (IC) design automation has traditionally followed a hierarchical approach. Modern IC design flow is divided into sequentially-addressed design and optimization layers; each successively finer in design detail and data granularity while increasing in computational complexity. Eventual agreement across the design layers signals design closure. Obtaining design closure is a continual...

chapter

Acceleration of Functional Validation Using GPGPU

L Suresh, N Rameshan, M S Gaur, M Zwolinski, et al.

2011 Sixth IEEE International Symposium on Electronic Design, Test and Application > 211 - 216

2011 IEEE 6th International Workshop on Electronic Design, Test and Application (DELTA 2011)

Logic simulation of a VLSI chip is a computationally intensive process. There exists an urgent need to map functional validation algorithms onto parallel architectures to aid hardware designers in meeting time-to-market constraints. In this paper, we propose three novel methods for logic simulation of combinational circuits on GPGPUs. Initial experiments run on two methods using benchmark circuits...

chapter

Evaluating the potential of graphics processors for high performance embedded computing

Shuai Mu, Chenxi Wang, Ming Liu, Dongdong Li, et al.

2011 Design, Automation & Test in Europe > 1 - 6

2011 Design, Automation & Test in Europe

Today's high performance embedded computing applications are posing significant challenges for processing throughput. Traditionally, such applications have been realized on application specific integrated circuits (ASICs) and/or digital signal processors (DSP). However, ASICs' advantage in performance and power often could not justify the fast increasing fabrication cost, while current DSP offers...

chapter

CuMAPz: A tool to analyze memory access patterns in CUDA

Yooseong Kim, Aviral Shrivastava

2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC) > 128 - 133

2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC)

The CUDA programming model provides a simple interface to program on GPUs, but tuning GPGPU applications for high performance is still quite challenging. Programmers need to consider several architectural details, and small changes in source code, especially in the memory access pattern, affect performance significantly. This paper presents CuMAPz, a tool to compare the memory performance of a CUDA program...

chapter

Comparing performance and energy efficiency of FPGAs and GPUs for high productivity computing

B Betkaoui, D B Thomas, W Luk

2010 International Conference on Field-Programmable Technology > 94 - 101

2010 International Conference on Field-Programmable Technology (FPT 2010)

This paper provides the first comparison of performance and energy efficiency of high productivity computing systems based on FPGA (Field-Programmable Gate Array) and GPU (Graphics Processing Unit) technologies. The search for higher performance compute solutions has recently led to great interest in heterogeneous systems containing FPGA and GPU accelerators. While these accelerators can provide significant...

chapter

A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads

Shuai Che, J W Sheaffer, M Boyer, L G Szafaryn, et al.

IEEE International Symposium on Workload Characterization (IISWC'10) > 1 - 11

2010 IEEE International Symposium on Workload Characterization (IISWC 2010)

The recently released Rodinia benchmark suite enables users to evaluate heterogeneous systems including both accelerators, such as GPUs, and multicore CPUs. As Rodinia sees higher levels of acceptance, it becomes important that researchers understand this new set of benchmarks, especially in how they differ from previous work. In this paper, we present recent extensions to Rodinia and conduct a detailed...

chapter

A Micro-benchmark Suite for AMD GPUs

Ryan Taylor, Xiaoming Li

2010 39th International Conference on Parallel Processing Workshops > 387 - 396

2010 39th International Conference on Parallel Processing Workshops (ICPPW)

Optimizing programs for a Graphics Processing Unit (GPU) requires thorough knowledge about the values of architectural features for the new computing platform. However, this knowledge is frequently unavailable, e.g., due to insufficient documentation, which is probably a result of the infancy of general purpose computing on the GPU. What makes the modeling of program performance on GPU even more difficult...

Filter options

Data set: ieee
Keywords: KERNEL, BENCHMARK TESTING, GRAPHICS PROCESSING UNIT, INSTRUCTION SETS
