Search results

Items from 1 to 17 out of 17 results

chapter

Automatic OpenCL Code Generation for Multi-device Heterogeneous Architectures

Pei Li, Elisabeth Brunet, Francois Trahay, Christian Parrot, et al.

2015 44th International Conference on Parallel Processing > 959 - 968

2015 44th International Conference on Parallel Processing (ICPP)

Using multiple accelerators, such as GPUs or Xeon Phis, is attractive to improve the performance of large data parallel applications and to increase the size of their workloads. However, writing an application for multiple accelerators remains challenging today, because going from a single accelerator to multiple ones requires dealing with potentially non-uniform domain decomposition, inter-accelerator...

chapter

XcalableACC: Extension of XcalableMP PGAS Language Using OpenACC for Accelerator Clusters

Masahiro Nakao, Hitoshi Murai, Takenori Shimosaka, Akihiro Tabuchi, et al.

2014 First Workshop on Accelerator Programming using Directives > 27 - 36

2014 First Workshop on Accelerator Programming using Directives (WACCPD)

The present paper introduces the XcalableACC (XACC) programming model, which is a hybrid model of the XcalableMP (XMP) Partitioned Global Address Space (PGAS) language and OpenACC. XACC defines directives that enable programmers to mix XMP and OpenACC directives in order to develop applications that can use accelerator clusters with ease. Moreover, in order to improve the performance of stencil applications,...

chapter

Evaluation of the Global Address Space Programming Interface (GASPI)

Jens Breitbart, Mareike Schmidtobreick, Vincent Heuveline

2014 IEEE International Parallel & Distributed Processing Symposium Workshops > 717 - 726

2014 IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW)

The first exascale supercomputers are expected by the end of this decade and will presumably feature an increase in core count, but a decrease in the amount of memory available per core. As of now, it is still unclear if the current programming models will provide high performance on exascale systems. One programming model considered to be an alternative to MPI is the so-called partitioned global...

chapter

A compiler framework for automatically mapping data parallel programs to heterogeneous MPSoCs

Kiran Chandramohan, Michael F. P. O'Boyle

2014 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES) > 1 - 10

2014 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES)

Many of today's embedded devices are based on MultiProcessor System-on-Chips (MPSoCs). Such devices are usually heterogeneous, containing DSPs and specialized accelerators as well as one or more CPUs. This heterogeneity allows efficient implementations in specialized domains but is a barrier to their wider use. They are difficult to program as only the CPU is directly exposed to the programmer with...

chapter

A Transparent Collective I/O Implementation

Yongen Yu, Jingjin Wu, Zhiling Lan, Douglas H. Rudd, et al.

2013 IEEE 27th International Symposium on Parallel and Distributed Processing > 297 - 307

2013 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

I/O performance is vital for most HPC applications, especially those that generate a vast amount of data with the growth of scale. Many studies have shown that scientific applications tend to issue small and noncontiguous accesses in an interleaving fashion, causing different processes to access overlapping regions. In such scenarios, collective I/O is a widely used optimization technique. However,...

chapter

Exploring Deterministic Shared Memory Programming Model

Yu Zhang, Wei Hu

2012 13th International Conference on Parallel and Distributed Computing, Applications and Technologies > 144 - 149

2012 13th International Conference on Parallel and Distributed Computing Applications and Technologies (PDCAT)

Deterministic parallelism promises many benefits for parallel programming. Existing deterministic runtimes, however, either require modifying programs to use a restricted set of synchronization primitives, or limit scalability by relying on a centralized, deterministic thread scheduler. We addressed these challenges with single producer multi-consumer (SPMC) virtual memory as a possible foundation for...

chapter

On-the-Fly Synchronization Checking for Interactive Programming in XcalableMP

Tatsuya Abe, Mitsuhisa Sato

2012 41st International Conference on Parallel Processing Workshops > 29 - 37

2012 41st International Conference on Parallel Processing Workshops (ICPPW)

XcalableMP (XMP) is a directive-based partitioned global address space language. In XMP, programmers can include explicit synchronizations by adding directives to their source code. In this sense, XMP provides programmers with performance awareness. As such, part of the performance of programs can be attributed to the programmers, i.e., XMP requires interactive programming by the programmers...

chapter

Parallelism as a Concern in Java through Fork-join Synchronization Patterns

Cristian Mateos, Alejandro Zunino, Matias Hirsch

2012 12th International Conference on Computational Science and Its Applications > 49 - 56

2012 12th International Conference on Computational Science and Its Applications (ICCSA)

We are facing a hardware revolution given by the increasing availability of multicore computers, clusters, Grids, and combinations of these. Consequently, there is plenty of computational power, but today's programmers are not fully prepared to exploit parallelism and distribution. Particularly, Java has helped in handling the heterogeneity of such environments, but there is a lack of facilities to...

chapter

An Empirical Performance Study of Chapel Programming Language

Nan Dun, Kenjiro Taura

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum > 497 - 506

2012 26th IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

In this paper we evaluate the performance of the Chapel programming language from the perspective of its language primitives and features, where the micro benchmarks are synthesized from our lessons learned in developing molecular dynamics simulation programs in Chapel. Experimental results show that most language building blocks have comparable performance to corresponding hand-written C code, while...

chapter

Implementation of XcalableMP Device Acceleration Extention with OpenCL

Takuma Nomizu, Daisuke Takahashi, Jinpil Lee, Taisuke Boku, et al.

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum > 2394 - 2403

2012 26th IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

Due to their outstanding computational performance, many acceleration devices, such as GPUs, the Cell Broadband Engine (Cell/B.E.), and multi-core computing are attracting a lot of attention in the field of high-performance computing. Although there are many programming models and languages designed for programming accelerators, such as CUDA, AMD Accelerated Parallel Processing (AMD APP), and OpenCL,...

chapter

Fast and lightweight support for nested parallelism on cluster-based embedded many-cores

Andrea Marongiu, Paolo Burgio, Luca Benini

2012 Design, Automation & Test in Europe Conference & Exhibition (DATE) > 105 - 110

2012 Design, Automation & Test in Europe Conference & Exhibition (DATE 2012)

Several recent many-core accelerators have been architected as fabrics of tightly-coupled shared memory clusters. A hierarchical interconnection system is used, with a crossbar-like medium inside each cluster and a network-on-chip (NoC) at the global level, which makes memory operations non-uniform (NUMA). Nested parallelism represents a powerful programming abstraction for these architectures, where...

chapter

Partial globalization of partitioned address spaces for zero-copy communication with shared memory

Fangzhou Jiao, Nilesh Mahajan, Jeremiah Willcock, Arun Chauhan, et al.

2011 18th International Conference on High Performance Computing > 1 - 10

2011 18th International Conference on High Performance Computing (HiPC)

We have developed a high-level language, called Kanor, for declaratively specifying communication in parallel programs. Designed as an extension of C++, it serves to coordinate partitioned address space programs written in the bulk synchronous parallel (BSP) style. Kanor's declarative semantics enable the programmers to write correct and maintainable parallel applications. The communication abstraction...

chapter

Formal Refinement of BSP Programs with Early Cost Evaluation

Virginia Niculescu

2011 10th International Symposium on Parallel and Distributed Computing > 49 - 56

2011 10th International Symposium on Parallel and Distributed Computing (ISPDC)

The paper presents a method that allows formal refinement of BSP programs. We may consider a BSP program as a set of parameterized processes that communicate via message-passing. A parameterized process is refined into a sequence of BSP super steps each containing an ordinary sequential process and a communication process. The method uses parameterized pre- and post-conditions, and takes into account...

chapter

Gossamer: A Lightweight Approach to Using Multicore Machines

Joseph A Roback, Gregory R Andrews

2010 39th International Conference on Parallel Processing > 30 - 39

39th International Conference on Parallel Processing (ICPP 2010)

The key to performance improvements in the multi-core era is for software to utilize the available concurrency. This paper presents a lightweight programming framework called Gossamer that is easy to use, enables the solution of a broad range of parallel programming problems, and produces efficient code. Gossamer contains (1) a set of high-level annotations that one adds to a sequential program to...

article

Programming Experiences Using the X10 Language

M Tajchman

Computing in Science & Engineering > 2010 > 12 > 6 > 62 - 69

Future large-scale parallel heterogeneous machines will likely need new programming approaches; X10 proposes one such alternative.

chapter

The Research and Application of Apla-Java Reusable Components

Jie Anquan, Wan Lan, Hua Zhizhang, Xue Jinyun

2008 International Symposium on Computer Science and Computational Technology > 1 > 356 - 359

2008 International Symposium on Computer Science and Computational Technology (ISCSCT)

Software reuse technology can greatly improve the efficiency of program development. A reusable Apla-Java component has been developed in research on the PAR (Partition and Recur) method and its tools. We have drawn on reuse-driven software theory and partial implementation theory, which effectively ensure the accuracy of the components. Apla-Java component is an important...

chapter

OpenMPD: A Directive-Based Data Parallel Language Extension for Distributed Memory Systems

J. Lee, M. Sato, T. Boku

2008 International Conference on Parallel Processing - Workshops > 121 - 128

2008 International Conference on Parallel Processing Workshops (ICPP-W)

OpenMPD is a language extension for programming on distributed memory systems that helps users by having minimal and simple notations. Although MPI is the de facto standard for parallel programming on distributed memory systems, writing MPI programs is often a time-consuming and complicated process. OpenMPD supports typical parallelization based on the data parallel paradigm and work sharing, and...

Filter options

Data set:
ieee
Keywords:
SYNCHRONIZATION
PROGRAMMING
ARRAYS

Publication type

book (16)
article (1)

Keywords

INSTRUCTION SETS (5)
PARALLEL PROCESSING (5)
COMPUTATIONAL MODELING (4)
PARALLEL PROGRAMMING (4)
LIBRARIES (3)
PARALLEL (3)
ALGORITHM DESIGN AND ANALYSIS (2)
DISTRIBUTED COMPUTING (2)
ELECTRONICS PACKAGING (2)
JAVA (2)
KERNEL (2)
OBJECT ORIENTED MODELING (2)
ONE-SIDED COMMUNICATION (2)
OPENCL (2)
PARALLEL LANGUAGES (2)
PERFORMANCE EVALUATION (2)
PROGRAM PROCESSORS (2)
RECEIVERS (2)
RUNTIME (2)
SHARED MEMORY (2)
SOFTWARE (2)
SYNTACTICS (2)
ABSTRACTION (1)
ABSTRACTS (1)
ACCELERATION (1)
ACCELERATOR (1)
ACCELERATORS (1)
ANNOTATIONS (1)
APLA-JAVA (1)
APLA-JAVA REUSABLE COMPONENTS (1)
APPLICATION PROGRAM INTERFACES (1)
BENCHMARK TESTING (1)
C (1)
CLUSTER (1)
CODE GENERATION (1)
COLLECTIVE I/O (1)
COMPILER (1)
COMPILER, SPMD, DATA (1)
COMPILERS (1)
COMPLEXITY THEORY (1)
COMPUTER LANGUAGES (1)
COMPUTERS (1)
CONCURRENCY CONTROL (1)
CONCURRENT COMPUTING (1)
DATA PARALLELISM (1)
DATA-DISTRIBUTION (1)
DESIGN LANGUAGE; DEVELOPMENT COMPILER; ACCELERATOR CLUSTER; PARTITIONED GLOBAL ADDRESS SPACE LANGUAGE (1)
DETERMINISTIC PARALLELISM (1)
DIGITAL SIGNAL PROCESSING (1)
DIRECTIVE (1)
DIRECTIVE-BASED DATA PARALLEL LANGUAGE EXTENSION (1)
DISTRIBUTED MEMORY SYSTEM (1)
DISTRIBUTED MEMORY SYSTEMS (1)
DISTRIBUTED PROCESSING (1)
DOMAIN DECOMPOSITION (1)
FILE SYSTEMS (1)
FORK-JOIN SYNCHRONIZATION PATTERNS (1)
GASPI (1)
GLOBAL ADDRESSING SPACE (1)
GOSSAMER (1)
GRAPHICS PROCESSING UNIT (1)
GRAPHICS PROCESSING UNITS (1)
HARDWARE (1)
HETEROGENEOUS ARCHITECTURES (1)
HETEROGENEOUS MACHINES (1)
HETEROGENEOUS PROCESSORS (1)
HIGH PERFORMANCE COMPUTING (1)
HIGH-LEVEL ANNOTATIONS (1)
HPC (1)
I/O INTENSIVE APPLICATIONS (1)
INDEXES (1)
ITERATIVE PARALLELISM (1)
JACOBIAN MATRICES (1)
LANGUAGES (1)
LARGE SCALE PARALLEL MACHINES (1)
LAYOUT (1)
LIGHTWEIGHT PROGRAMMING FRAMEWORK (1)
MAPREDUCE COMPUTATIONS (1)
MEMORY ARCHITECTURE (1)
MEMORY MANAGEMENT (1)
MESSAGE PASSING (1)
MESSAGE PASSING INTERFACE (1)
MESSAGE SYSTEMS (1)
MODEL (1)
MPI (1)
MPI CODING (1)
MULTI-CORE PROGRAMMING (1)
MULTICORE MACHINES (1)
MULTICORE PROCESSING (1)
MULTICORE PROGRAMMING (1)
MULTIPROCESSING SYSTEMS (1)
ON-THE-FLY CHECKING (1)
OPENMPD (1)
OPTIMIZATION (1)
OWNERSHIP TRANSFER (1)
PAR METHOD (1)
PARALLEL COMPUTATION (1)
