Search results

Items from 1 to 20 out of 31 results

chapter

AnalyzeThat: A Programmable Shared-Memory System for an Array of Processing-In-Memory Devices

Sangkuen Lee, Hyogi Sim, Youngjae Kim, Sudharshan S. Vazhkudai

2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) > 619 - 624

2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)

Processing In Memory (PIM), the concept of integrating processing directly with memory, has been attracting a lot of attention since PIM can assist in overcoming the throughput limitation caused by data movement between CPU and memory. The challenge, however, is that it requires the programmers to have a deep understanding of the PIM architecture to maximize the benefits such as data locality and...

chapter

Towards a GraphBLAS Library in Chapel

Ariful Azad, Aydin Buluc

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 1095 - 1104

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

The adoption of a programming language is positively influenced by the breadth of its software libraries. Chapel is a modern and relatively young parallel programming language. Consequently, not many domain-specific software libraries exist that are written for Chapel. Graph processing is an important domain with many applications in cyber security, energy, social networking, and health. Implementing...

chapter

PGAS Communication Runtime for Extreme Large Data Computation

Ryo Matsumiya, Toshio Endo

2016 Second International Workshop on Extreme Scale Programming Models and Middleware (ESPM2) > 10 - 16

2016 Second International Workshop on Extreme Scale Programming Models and Middleware (ESPM2)

For partitioned global address space (PGAS) runtimes, supporting out-of-core data computation is an important issue. Some researchers have shown that flash SSDs are useful for out-of-core data computation. In this paper, we introduce ComEx-PM, a PGAS communication runtime. ComEx-PM supports out-of-core data computation using a flash SSD. ComEx-PM launches multiple processes in each node. Memory region...

chapter

Application of PGAS Programming to Power Grid Simulation

Bruce Palmer

2016 PGAS Applications Workshop (PAW) > 33 - 40

2016 PGAS Applications Workshop (PAW)

This paper will describe the application of the PGAS Global Arrays (GA) library to power grid simulations. The GridPACK™ framework has been designed to enable power grid engineers to develop parallel simulations of the power grid by providing a set of templates and libraries that encapsulate most of the details of parallel programming in higher level abstractions. The communication portions of the...

chapter

Specializing Compiler Optimizations through Programmable Composition for Dense Matrix Computations

Qing Yi, Qian Wang, Huimin Cui

2014 47th Annual IEEE/ACM International Symposium on Microarchitecture > 596 - 608

2014 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)

General-purpose compilers aim to extract the best average performance for all possible user applications. Due to the lack of specialization for different types of computations, compiler-attained performance often lags behind that of manually optimized libraries. In this paper, we demonstrate a new approach, programmable composition, to enable the specialization of compiler optimizations without...

chapter

Software technologies coping with memory hierarchy of GPGPU clusters for stencil computations

Toshio Endo, Guanghao Jin

2014 IEEE International Conference on Cluster Computing (CLUSTER) > 132 - 139

2014 IEEE International Conference On Cluster Computing (CLUSTER)

Stencil computations, which are important kernels for CFD simulations, have been highly successful on GPGPU clusters, due to the high memory bandwidth and computation speed of GPU accelerators. However, the sizes of the computed domains are limited by the small capacity of GPU device memory. In order to support larger domain sizes, we utilize the memory hierarchy of GPGPU clusters; larger host memory is used...

chapter

Bohrium: A Virtual Machine Approach to Portable Parallelism

Mads R.B. Kristensen, Simon A.F. Lund, Troels Blum, Kenneth Skovhede, more

2014 IEEE International Parallel & Distributed Processing Symposium Workshops > 312 - 321

2014 IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW)

In this paper we introduce Bohrium, a runtime system for mapping vector operations onto a number of different hardware platforms, from simple multi-core systems to clusters and GPU-enabled systems. In order to make efficient choices, Bohrium is implemented as a virtual machine that makes runtime decisions, rather than as a statically compiled library, which is the more common approach. In principle,...

chapter

Hybrid Programming Based on Matlab Engine Technology

Huan Li, Wenhua Ye

2013 International Conference on Computer Sciences and Applications > 454 - 457

2013 International Conference on Computer Sciences and Applications (CSA)

In this paper, a dual-language hybrid programming approach based on Matlab engine technology and an example implementation are described. Many Matlab functions can be used effectively through this technology, which reduces the programming workload; it can also inherit the excellent VC program interface. It is therefore a good hybrid program design method for debugging hardware and software interfaces,...

chapter

Improving performance of JNA by using LLVM JIT compiler

Yu-Hsin Tsai, I-Wei Wu, I-Chun Liu, Jean Jyh-Jiun Shann

2013 IEEE/ACIS 12th International Conference on Computer and Information Science (ICIS) > 483 - 488

2013 IEEE/ACIS 12th International Conference on Computer and Information Science (ICIS)

Java Native Access (JNA) has been proposed to alleviate the burden of programming in Java Native Interface (JNI). JNA allows programmers to call native functions without writing any JNI code. However, JNA suffers from some performance degradation. To overcome this problem, in this paper, we modify the JNA source code and integrate the LLVM JIT compiler into JNA to improve the performance. Our experiment...

chapter

SOMETHINGit: A prototyping library for live and sound improvisation

Tomohiro Oda, Kumiyo Nakakoji, Yasuhiro Yamamoto

2013 1st International Workshop on Live Programming (LIVE) > 11 - 14

2013 1st International Workshop on Live Programming (LIVE)

Live programming can be considered an interaction with incomplete code. Dynamic languages embrace a similar style of programming, such as pair programming and prototyping in a review session. Static languages require a certain degree of completeness of code, such as type safety and namespace resolution. SOMETHINGit is a Smalltalk library that combines dynamic Smalltalk and static Haskell and VDM-SL...

chapter

A Transparent Collective I/O Implementation

Yongen Yu, Jingjin Wu, Zhiling Lan, Douglas H. Rudd, more

2013 IEEE 27th International Symposium on Parallel and Distributed Processing > 297 - 307

2013 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

I/O performance is vital for most HPC applications, especially those that generate a vast amount of data with the growth of scale. Many studies have shown that scientific applications tend to issue small and noncontiguous accesses in an interleaving fashion, causing different processes to access overlapping regions. In such a scenario, collective I/O is a widely used optimization technique. However,...

chapter

Code Reuse Prevention through Control Flow Lazily Check

Linbo Chen, Jianhui Jiang, Danqing Zhang

2012 IEEE 18th Pacific Rim International Symposium on Dependable Computing > 51 - 60

2012 IEEE 18th Pacific Rim International Symposium on Dependable Computing (PRDC)

Despite the numerous prevention and protection techniques that have been developed, the exploitation of memory corruption vulnerabilities still represents a serious threat to the security of software systems and networks. Because of the adoption of the write or execute only policy (W⊕X) and address space layout randomization (ASLR), modern operating systems have been strengthened against code injection...

chapter

Parallelism as a Concern in Java through Fork-join Synchronization Patterns

Cristian Mateos, Alejandro Zunino, Matias Hirsch

2012 12th International Conference on Computational Science and Its Applications > 49 - 56

2012 12th International Conference on Computational Science and Its Applications (ICCSA)

We are facing a hardware revolution given by the increasing availability of multicore computers, clusters, Grids, and combinations of these. Consequently, there is plenty of computational power, but today's programmers are not fully prepared to exploit parallelism and distribution. Particularly, Java has helped in handling the heterogeneity of such environments, but there is a lack of facilities to...

chapter

A Portable High-Productivity Approach to Program Heterogeneous Systems

Zeki Bozkus, Basilio B. Fraguela

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum > 163 - 173

2012 26th IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

The exploitation of heterogeneous resources is becoming increasingly important for general purpose computing. Unfortunately, heterogeneous systems require much more effort to be programmed than the traditional single or even multi-core computers most programmers are familiar with. Not only new concepts, but also new tools with different restrictions must be learned and applied. Additionally, many...

chapter

A Lightweight C++ Interface to MPI

Simone Pellegrini, Radu Prodan, Thomas Fahringer

2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing > 3 - 10

2012 20th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)

The Message Passing Interface (MPI) provides bindings for the three programming languages commonly used in High Performance Computing (HPC): C, C++ and Fortran. Unfortunately, MPI supports only the lowest common denominator of the three languages, providing a level of abstraction far lower than typical C++ libraries. Lately, after the decision of the MPI committee to deprecate and remove the C++ bindings...

chapter

Intel's Array Building Blocks: A retargetable, dynamic compiler and embedded language

C J Newburn, Byoungro So, Zhenying Liu, M McCool, more

International Symposium on Code Generation and Optimization (CGO 2011) > 224 - 235

2011 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2011)

Our ability to create systems with large amounts of hardware parallelism is exceeding the average software developer's ability to effectively program them. This is a problem that plagues our industry. Since the vast majority of the world's software developers are not parallel programming experts, making it easy to write, port, and debug applications with sufficient core and vector parallelism is essential...

chapter

Customizable Composition and Parameterization of Hardware Design Transformations

Tim Todman, Qiang Liu, Wayne Luk, George Constantinides

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools > 595 - 602

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools (DSD)

A promising approach to high-level design is to start initially with an obvious but possibly inefficient design, and apply multiple transformations to meet design goals. Many hardware compilation tools support a fixed recipe of applying design transformations, but designers have few options to adapt the recipe without re-writing the tools themselves. In addition, complex transformations based on linear...

chapter

The study of experiment teaching in data structure

Liu Chengxia, Cai Ying, Li Ning, Wang Tiefeng

2010 5th International Conference on Computer Science & Education > 631 - 634

2010 5th International Conference on Computer Science & Education (ICCSE 2010)

In the new development of modern education, enhancing the practical part of teaching is the key approach to improving teaching quality. In this paper, the problems existing in the teaching process of data structures are expounded and analyzed in detail. New reformed experiment teaching methods are put forward, and an instance is given at the end of the paper.

chapter

GPUMP: A Multiple-Precision Integer Library for GPUs

Kaiyong Zhao, Xiaowen Chu

2010 10th IEEE International Conference on Computer and Information Technology > 1164 - 1168

2010 IEEE 10th International Conference on Computer and Information Technology (CIT)

Multiple-precision integer operations are key components of many security applications, but unfortunately they are computationally expensive on contemporary CPUs. In this paper, we present our design and implementation of a multiple-precision integer library for GPUs, which is implemented in CUDA. We report our experimental results, which show that a significant speedup can be achieved by GPUs as compared...

chapter

cCell: A S2S Compiler with Profiling Support for the Cell BE Architecture

Xiuxiu Bai, Xingjun Zhang, Guofu Feng, Jinghua Feng, more

2010 10th IEEE International Conference on Computer and Information Technology > 2946 - 2951

2010 IEEE 10th International Conference on Computer and Information Technology (CIT)

In this paper, a source to source (S2S) compiler with profiling support is designed and implemented. The focus of this compiler is to convert the source code running in the homogeneous environment to the code that can be compiled and run under the Cell BE architecture. Combined with the runtime profiling mechanism, the S2S compiler records the optimization strategies and their effects, which can be...

Data set: ieee
Keywords: ARRAYS, PROGRAMMING, LIBRARIES
Publication type: book

Keywords

PARALLEL PROGRAMMING (7)
APPLICATION PROGRAM INTERFACES (5)
HARDWARE (5)
OPTIMIZATION (5)
PARALLEL PROCESSING (5)
RUNTIME (5)
C++ LANGUAGE (4)
DATA MINING (4)
INDEXES (4)
PROGRAM PROCESSORS (4)
SOFTWARE (4)
API (3)
DATA STRUCTURES (3)
ENGINES (3)
MULTIPROCESSING SYSTEMS (3)
SOFTWARE LIBRARIES (3)
SYNCHRONIZATION (3)
VECTORS (3)
APPLICATION PROGRAMMING INTERFACE (2)
BENCHMARK TESTING (2)
COMPUTATIONAL MODELING (2)
COMPUTER ARCHITECTURE (2)
COMPUTER SCIENCE EDUCATION (2)
CONTAINERS (2)
COPROCESSORS (2)
DISTRIBUTED COMPUTING (2)
ELECTRONICS PACKAGING (2)
GPU (2)
HPC (2)
INSTRUCTION SETS (2)
JAVA (2)
KERNEL (2)
LAYOUT (2)
MESSAGE PASSING (2)
MESSAGE PASSING INTERFACE (2)
MICROPROCESSORS (2)
MPI (2)
OBJECT-ORIENTED PROGRAMMING (2)
PERFORMANCE EVALUATION (2)
PGAS (2)
PROGRAM COMPILERS (2)
PROGRAM DEBUGGING (2)
PROGRAMMABILITY (2)
PUBLIC KEY CRYPTOGRAPHY (2)
RANDOM ACCESS MEMORY (2)
REACTIVE POWER (2)
2^N-BASED NUMBER SYSTEM CALCULATION (1)
ADAPTIVE MESH REFINEMENT (1)
AGGREGATE REMOTE MEMORY COPY INTERFACE (1)
ALGORITHM DESIGN AND ANALYSIS (1)
ARBB (1)
ARITHMETIC CALCULATION ALGORITHM (1)
ARRAY INTENSIVE COMPUTATIONS (1)
ARTIFICIAL NEURAL NETWORKS (1)
AUTOMATIC PROGRAMMING (1)
AUTOMATIC SCALABLE QUEUE TEMPLATE (1)
BANDWIDTH (1)
BENCHMARK (1)
BLOCK-RECURSIVE NATURE (1)
BOOKS (1)
BRIDGES (1)
BUILT-IN DATA TYPE (1)
BUILT-IN SELF-TEST (1)
C LANGUAGE (1)
C++ (1)
C++0X CONCEPTS (1)
CCELL (1)
CELL BE ARCHITECTURE (1)
CHANNEL MODELS (1)
CHAPEL (1)
CODE OPTIMIZATIONS (1)
CODE REUSE ATTACK (1)
CODE TRANSFORMATIONS (1)
COLLECTIVE I/O (1)
COMMUNICATING SEQUENTIAL PROCESS (1)
COMMUNICATING SEQUENTIAL PROCESSES (1)
COMPILATION (1)
COMPILER (1)
COMPUTE UNIFIED DEVICE ARCHITECTURE (1)
COMPUTER GRAPHIC EQUIPMENT (1)
COMPUTER SCIENCE (1)
COMPUTERS (1)
COMPUTERS AND INFORMATION PROCESSING (1)
CONCEPT IDENTIFICATION (1)
CONTEXT (1)
CONTROL FLOW INTEGRITY (1)
CONTROL FLOW LAZILY CHECK (1)
CRYPTOGRAPHY (1)
CSP (1)
CUDA (1)
DATA ABSTRACTION (1)
DATA FLOW COMPUTING (1)
DATA FLOW PATTERNS (1)
DATA STRUCTURE (1)
DATA-PARALLEL (1)
DATA-PARALLEL LIBRARY-BASED PROGRAMMING (1)
DEADLOCK DETECTION (1)

INFONA - science communication portal
