2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

chapter

Snowflake: A Lightweight Portable Stencil DSL

Nathan Zhang, Michael Driscoll, Charles Markley, Samuel Williams, et al.

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 795 - 804

Stencil computations are not well optimized by general-purpose production compilers and the increased use of multicore, manycore, and accelerator-based systems makes the optimization problem even more challenging. In this paper we present Snowflake, a Domain Specific Language (DSL) for stencils that uses a "micro-compiler" approach, i.e., small, focused, domain-specific code generators....

chapter

Photomosaic Generation by Rearranging Subimages, with GPU Acceleration

Yi Yang, Yasuaki Ito, Koji Nakano

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 942 - 951

The main contribution of this paper is to show a new photomosaic generation method by rearranging subimages of an image. In the photomosaic generation, an input image is divided into small subimages and they are rearranged such that the rearranged image reproduces another image given as a target image. Therefore, this problem can be considered as a combinatorial optimization problem to obtain the...

chapter

Optimal Bandwidth Selection for Kernel Regression Using a Fast Grid Search and a GPU

Chris Rohlfs, Mohamed Zahran

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 550 - 556

This study presents a new algorithm and corresponding statistical package for estimating optimal bandwidth for a nonparametric kernel regression. Kernel regression is widely used in Economics, Statistics, and other fields. The formula for the optimal "bandwidth," or smoothing parameter, is well-known. In practice, however, the computational demands of estimating the optimal bandwidth have...

chapter

Large-Scale Stochastic Learning Using GPUs

Thomas Parnell, Celestine Duenner, Kubilay Atasu, Manolis Sifalakis, et al.

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 419 - 428

In this work we propose an accelerated stochastic learning system for very large-scale applications. Acceleration is achieved by mapping the training algorithm onto massively parallel processors: we demonstrate a parallel, asynchronous GPU implementation of the widely used stochastic coordinate descent/ascent algorithm that can provide up to 35× speed-up over a sequential CPU implementation. In order...

chapter

Auto-Tuning Strategies for Parallelizing Sparse Matrix-Vector (SpMV) Multiplication on Multi- and Many-Core Processors

Kaixi Hou, Wu-chun Feng, Shuai Che

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 713 - 722

Because sparse matrix-vector multiplication (SpMV) is an important and widely used computational kernel in many real-world applications, it behooves us to accelerate SpMV on modern multi- and many-core architectures. While many storage formats have been developed to facilitate SpMV operations, the compressed sparse row (CSR) format is still the most popular and general storage format. However, parallelizing...

chapter

A Pluggable Framework for Composable HPC Scheduling Libraries

Max Grossman, Vivek Kumar, Nick Vrvilo, Zoran Budimlic, et al.

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 723 - 732

Driven by the increasing diversity of current and future HPC hardware and software platforms, the HPC community has seen a dramatic increase in research and development efforts into the composability of discrete software systems. While modularity is often desirable from a software engineering, quality assurance, and maintainability perspective, the barriers between software components often hide optimization...

chapter

Algorithmic Performance-Accuracy Trade-off in 3D Vision Applications Using HyperMapper

Luigi Nardi, Bruno Bodin, Sajad Saeedi, Emanuele Vespa, et al.

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1434 - 1443

In this paper we investigate an emerging application, 3D scene understanding, likely to be significant in the mobile space in the near future. The goal of this exploration is to reduce execution time while meeting our quality of result objectives. In previous work, we showed for the first time that it is possible to map this application to power constrained embedded systems, highlighting that decision...

chapter

Exploring Translation of OpenMP to OpenACC 2.5: Lessons Learned

Sergio Pino, Lori Pollock, Sunita Chandrasekaran

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 673 - 682

Scientists who want to exploit the computing power of the latest parallel architectures are faced with a diverse set of architectures and a number of programming languages, models and approaches. Among several such programming techniques are directive-based programming models, OpenMP and OpenACC. This paper explores the similarities and the functionality gaps between both models and presents insights...

chapter

Training Many Neural Networks in Parallel via Back-Propagation

Javier A. Cruz-Lopez, Vincent Boyer, Didier El-Baz

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 501 - 509

This paper presents two parallel implementations of the Back-propagation algorithm, a widely used approach for Artificial Neural Networks (ANNs) training. These implementations permit one to increase the number of ANNs trained simultaneously, taking advantage of the thread-level massive parallelism of GPUs and multi-core architecture of modern CPUs, respectively. Computational experiments are carried out with...

chapter

RAI: A Scalable Project Submission System for Parallel Programming Courses

Abdul Dakkak, Carl Pearson, Cheng Li, Wen-mei Hwu

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 315 - 322

A major component of many advanced programming courses is an open-ended "end-of-term project" assignment. Delivering and evaluating open-ended parallel programming projects for hundreds or thousands of students brings a need for broad system reconfigurability coupled with challenges of testing and development uniformity, access to esoteric hardware and programming environments, scalability,...

chapter

Time and Energy to Solution Evaluation for the Three-Point Angular Correlation Function

Antonio Gomez-Iglesias, Miguel Cardenas-Montes

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 703 - 712

This paper analyzes the performance of different implementations of a three-point angular correlation function. This function is used in the study of large scale distribution of galaxies in a variety of computational platforms. The function is based on histogram construction and presents a large computational cost. This cost dramatically increases with the size of the datasets. The implementation...

chapter

Accelerating the Smith-Waterman Algorithm Using Bitwise Parallel Bulk Computation Technique on GPU

Takahiro Nishimura, Jacir L. Bordim, Yasuaki Ito, Koji Nakano

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 932 - 941

The bulk execution of a sequential algorithm is to execute it for many different inputs in turn or at the same time. It is known that the bulk execution of an oblivious sequential algorithm can be implemented to run efficiently on a GPU. The bulk execution supports fine grained bitwise parallelism, allowing it to achieve high acceleration over a straightforward sequential computation. The main contribution...
