Search results for: A Nukada

Items from 1 to 7 out of 7 results

chapter

Low-overhead diskless checkpoint for hybrid computing systems

L B Gomez, A Nukada, N Maruyama, F Cappello, et al.

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

As the size of new supercomputers scales to tens of thousands of sockets, the mean time between failures (MTBF) is decreasing to just several hours and long executions need some kind of fault tolerance method to survive failures. Checkpoint/Restart is a popular technique used for this purpose; but writing the state of a big scientific application to remote storage will become prohibitively expensive...

chapter

An 80-Fold Speedup, 15.0 TFlops Full GPU Acceleration of Non-Hydrostatic Weather Model ASUCA Production Code

T Shimokawabe, T Aoki, C Muroi, J Ishida, et al.

2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis > 1 - 11

2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis

Regional weather forecasting demands fast simulation over fine-grained grids, resulting in extremely memory-bottlenecked computation, a difficult problem on conventional supercomputers. Early work on accelerating mainstream weather code WRF using GPUs with their high memory performance, however, resulted in only minor speedup due to partial GPU porting of the huge code. Our full CUDA porting of the...

chapter

Statistical power modeling of GPU kernels using performance counters

H Nagasaka, N Maruyama, A Nukada, T Endo, et al.

International Conference on Green Computing > 115 - 122

2010 International Conference on Green Computing (Green Comp)

We present a statistical approach for estimating power consumption of GPU kernels. We use the GPU performance counters that are exposed for CUDA applications, and train a linear regression model where performance counters are used as independent variables and power consumption is the dependent variable. For model training and evaluation, we use publicly available CUDA applications, consisting of 49...

chapter

Aspects of GPU for general purpose high performance computing

R. Suda, T. Aoki, S. Hirasawa, A. Nukada, et al.

2009 Asia and South Pacific Design Automation Conference > 216 - 223

ASP-DAC 2009. 14th Asia and South Pacific Design Automation Conference

We discuss hardware and software aspects of GPGPU, specifically focusing on NVIDIA cards and CUDA, from the viewpoints of parallel computing. The major weak points of GPU against newest supercomputers are identified to be and summarized as only four points: large SIMD vector length, small memory, absence of fast L2 cache, and high register spill penalty. As software concerns, we derive optimal scheduling...

chapter

Bandwidth intensive 3-D FFT kernel for GPUs using CUDA

A. Nukada, Y. Ogata, T. Endo, S. Matsuoka

2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis > 1 - 11

2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis

Most GPU performance "hypes" have focused around tightly-coupled applications with small memory bandwidth requirements e.g., N-body, but GPUs are also commodity vector machines sporting substantial memory bandwidth; however, effective programming methodologies thereof have been poorly studied. Our new 3-D FFT kernel, written in NVIDIA CUDA, achieves nearly 80 GFLOPS on a top-end GPU, being...

chapter

FFTSS: A High Performance Fast Fourier Transform Library

A. Nukada

2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings > 3 > III

2006 IEEE International Conference on Acoustics, Speech, and Signal Processing

In this paper, we introduce a new fast Fourier transform (FFT) library. In developing this software, we focus on the efficient execution of the floating-point operation instructions. To achieve high performance on various processors, we provide the source code which compilers can optimize easily. Since the compilers provided by processor vendors have powerful optimizers for loop sentences, the code...

chapter

LAPACK in SILC: use of a flexible application framework for matrix computation libraries

T. Kajiyama, A. Nukada, R. Suda, A. Nishida, et al.

Eighth International Conference on High-Performance Computing in Asia-Pacific Region (HPCASIA'5) > 8 pp. - 212

Proceedings. Eighth International Conference on High-Performance Computing in Asia-Pacific Region

This paper presents a novel application framework named simple interface for library collections (SILC) that allows users to make use of matrix computation libraries in a flexible and language-independent manner. Using SILC, various computing environments as well as alternative solvers and matrix storage formats from different libraries can be easily utilized. The present paper describes the design...

Keywords

COMPUTER GRAPHIC EQUIPMENT (5)
KERNEL (3)
REGISTERS (3)
BANDWIDTH (2)
COPROCESSORS (2)
FAST FOURIER TRANSFORMS (2)
GRAPHICS PROCESSING UNIT (2)
GRAPHICS PROCESSING UNITS (2)
PROGRAMMING (2)
SOFTWARE LIBRARIES (2)
SUPERCOMPUTER (2)
APPLICATION PROGRAM INTERFACES (1)
ARRAYS (1)
ASUCA (1)
BANDWIDTH INTENSIVE 3D FFT KERNEL (1)
CHECKPOINT-RESTART (1)
CHECKPOINTING (1)
CLOCKS (1)
COMPILERS (1)
COMPUTER ARCHITECTURE (1)
COMPUTER GRAPHICS (1)
CPU (1)
CUDA APPLICATIONS (1)
CUDA PORTING (1)
CUDA PROGRAMMING LANGUAGE (1)
DISCRETE WAVELET TRANSFORMS (1)
DISTRIBUTED SHARED MEMORY SYSTEMS (1)
ENCODING (1)
FAULT TOLERANCE (1)
FAULT TOLERANCE METHOD (1)
FAULT TOLERANT COMPUTING (1)
FAULT TOLERANT SYSTEMS (1)
FFTSS (1)
FINE-GRAINED GRIDS (1)
FLOATING POINT ARITHMETIC (1)
FLOATING-POINT OPERATION INSTRUCTIONS (1)
GENERAL PURPOSE HIGH PERFORMANCE COMPUTING (1)
GPGPU SYSTEM (1)
GPU (1)
GPU KERNELS (1)
GPU TSUBAME SUPERCOMPUTER (1)
GPU-ACCELERATED CLUSTER (1)
GRAPHIC PROCESSING UNIT (1)
GRID COMPUTING (1)
HARDWARE (1)
HDC TECHNIQUE (1)
HIGH PERFORMANCE FAST FOURIER TRANSFORM LIBRARY (1)
HOST-DEVICE DATA TRANSFER (1)
HYBRID COMPUTING SYSTEM (1)
LAPACK (1)
LINEAR REGRESSION MODEL (1)
LOW-OVERHEAD DISKLESS CHECKPOINT (1)
MAINFRAMES (1)
MATRICES (1)
MATRIX ALGEBRA (1)
MATRIX COMPUTATION LIBRARY (1)
MATRIX STORAGE FORMATS (1)
MEAN TIME BETWEEN FAILURE (1)
MEMORY-BOTTLENECKED COMPUTATION (1)
MTBF (1)
MULTICORE PROCESSING (1)
NONHYDROSTATIC WEATHER MODEL ASUCA PRODUCTION CODE (1)
NVIDIA CARD (1)
NVIDIA CUDA (1)
NVIDIA GT200 TESLA (1)
OFFCARD BANDWIDTH LIMITATION (1)
OPTIMAL SCHEDULING ALGORITHM (1)
PARALLEL COMPUTING (1)
PARALLEL MACHINES (1)
PARALLEL PROCESSING (1)
PERFORMANCE COUNTERS (1)
POWER AWARE COMPUTING (1)
POWER CONSUMPTION (1)
POWER CONSUMPTION ESTIMATION (1)
PROGRAM COMPILERS (1)
RADIATION DETECTORS (1)
REED-SOLOMON CODES (1)
REGIONAL WEATHER FORECASTING DEMANDS (1)
REGRESSION ANALYSIS (1)
SCHEDULING (1)
SHARED-MEMORY PARALLEL COMPUTING (1)
SILC (1)
SIMD VECTOR LENGTH (1)
SIMPLE INTERFACE FOR LIBRARY COLLECTIONS (1)
SOFTWARE FAULT TOLERANCE (1)
SOURCE CODE (1)
SOURCE CODING (1)
SPECIFICATION LANGUAGES (1)
SPMD PARALLELISM (1)
STATISTICAL POWER MODELING (1)
SUPERCOMPUTERS (1)
TFLOPS FULL GPU ACCELERATION (1)
TOKYO INSTITUTE OF TECHNOLOGY (1)
TSUBAME 2.0 (1)
WEATHER FORECASTING (1)

INFONA - science communication portal
