Search results

chapter

Towards programmable address spaces

Andrew Gozillon, Paul Keir

2017 Federated Conference on Computer Science and Information Systems (FedCSIS) > 697 - 700

2017 Federated Conference on Computer Science and Information Systems (FedCSIS)

High-performance computing increasingly makes use of heterogeneous many-core parallelism. Individual processor cores within such systems are radically simpler than their predecessors, and tasks previously the responsibility of hardware are delegated to software. Rather than use a cache, fast on-chip memory is exposed through a handful of address space annotations, associating pointers with discrete...

chapter

SharP: Towards Programming Extreme-Scale Systems with Hierarchical Heterogeneous Memory

Manjunath Gorentla Venkata, Ferrol Aderholdt, Zachary Parchman

2017 46th International Conference on Parallel Processing Workshops (ICPPW) > 145 - 154

2017 46th International Conference on Parallel Processing Workshops (ICPPW)

The pre-exascale systems are expected to have a significant amount of hierarchical and heterogeneous on-node memory, and this trend of system architecture in extreme-scale systems is expected to continue into the exascale era. Along with hierarchical-heterogeneous memory, the system typically has a high-performing network and a compute accelerator. This system architecture is not only effective for...

chapter

Enabling One-Sided Communication Semantics on ARM

Pavel Shamis, M. Graham Lopez, Gilad Shainer

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) > 805 - 813

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

In this paper, we present our work to enable optimized one-sided communication operations on the ARM v8 architecture using a high-performance InfiniBand network interconnect, as well as an evaluation of our implementation. For this study, we started with an OpenSHMEM implementation based on Open MPI/SHMEM, and combined it with the UCX framework and the XPMEM kernel extension for shared memory communication...

chapter

Automatic generation of fast BLAS3-GEMM: A portable compiler approach

Xing Su, Xiangke Liao, Jingling Xue

2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) > 122 - 133

2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)

GEMM is the main computational kernel in BLAS3. Its micro-kernel is either hand-crafted in assembly code or generated from C code by general-purpose compilers (guided by architecture-specific directives or auto-tuning). Therefore, either performance or portability suffers. We present a POrtable Compiler Approach, Poca, implemented in LLVM, to automatically generate and optimize this micro-kernel in...

chapter

Advanced mapping techniques for digital signal processors

Tomas Fryza, Roman Mego

2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) > 213 - 217

2016 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)

This paper is focused on the hardware modeling and the algorithms mapping on the digital signal processor (DSP) with the very long instruction word (VLIW) architecture, such as TMS320C6000. The general methods to develop an efficient application for the target processor combine high- and/or low-level programming languages. Although the hardware capabilities of the nowadays processors and compilers...

chapter

Perilla: Metadata-Based Optimizations of an Asynchronous Runtime for Adaptive Mesh Refinement

Tan Nguyen, Didem Unat, Weiqun Zhang, Ann Almgren, et al.

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis > 945 - 956

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis

Hardware architecture is increasingly complex, urging the development of asynchronous runtime systems with advanced resource and locality management support. However, this support may come at the cost of complicating the user interface, while programming remains one of the major constraints to wide adoption of asynchronous runtimes in practice. In this paper, we propose a solution that leverages...

chapter

A Cluster-as-Accelerator Approach for SPMD-Free Data Parallelism

Maurizio Drocco, Claudia Misale, Marco Aldinucci

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) > 350 - 353

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)

In this paper we present a novel approach for functional-style programming of distributed-memory clusters, targeting data-centric applications. The programming model proposed is purely sequential, SPMD-free and based on high-level functional features introduced since C++11 specification. Additionally, we propose a novel cluster-as-accelerator design principle. In this scheme, cluster nodes act as...

chapter

Archborn: an open source tool for automated generation of chip heterogeneous multiprocessor architectures

Sen Ma, Hongyuan Ding, Miaoqing Huang, David Andrews

2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig) > 1 - 6

2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig)

Modern platform FPGAs are sufficiently dense to allow the assembly of a complete chip heterogeneous multiprocessor systems on chip (CHMPs) within a single die. Based on CHMP, every research group that sets out to explore how an application can be accelerated on an FPGA platform must firstly integrate processors, buses, memories, and support IP components into a base architecture prior to beginning...

chapter

Programming Model Elements for Hybrid Collaborative Adaptive Systems

Ognjen Scekic, Tommaso Schiavinotto, Dimitrios I. Diochnos, Michael Rovatsos, et al.

2015 IEEE Conference on Collaboration and Internet Computing (CIC) > 278 - 287

2015 IEEE Conference on Collaboration and Internet Computing (CIC)

Hybrid Diversity-aware Collective Adaptive Systems (HDA-CAS) is a new generation of socio-technical systems where both humans and machine peers complement each other and operate collectively to achieve their goals. These systems are characterized by the fundamental properties of hybridity and collectiveness, hiding from users the complexities associated with managing the collaboration and coordination...

chapter

Toward Interlanguage Parallel Scripting for Distributed-Memory Scientific Computing

Justin M. Wozniak, Timothy G. Armstrong, Ketan C. Maheshwari, Daniel S. Katz, et al.

2015 IEEE International Conference on Cluster Computing > 482 - 485

2015 IEEE International Conference on Cluster Computing (CLUSTER)

Scripting languages such as Python and R have been widely adopted as tools for the productive development of scientific software because of the power and expressiveness of the languages and available libraries. However, deploying scripted applications on large-scale parallel computer systems such as the IBM Blue Gene/Q or Cray XE6 is a challenge because of issues including operating system limitations,...

chapter

Design and Development of Domain Specific Active Libraries with Proxy Applications

Istvan Zoltan Reguly, Gihan R. Mudalige, Michael B. Giles

2015 IEEE International Conference on Cluster Computing > 738 - 745

2015 IEEE International Conference on Cluster Computing (CLUSTER)

Representative applications are versatile tools to evaluate new programming approaches, techniques and optimisations as a way to ensure continued high performance on future computing architectures. They make experimentation much easier before adopting changes/insights into the large scientific codes. In this paper we demonstrate the important role played by representative/proxy applications in designing...

chapter

Automatic construction of printable return-oriented programming payload

Wenbiao Ding, Xiao Xing, Ping Chen, Zhi Xin, et al.

2014 9th International Conference on Malicious and Unwanted Software: The Americas (MALWARE) > 18 - 25

2014 9th International Conference on Malicious and Unwanted Software: "The Americas" (MALWARE)

Return-oriented programming is a kind of code-reuse technique for attackers, which is very effective at bypassing the DEP defense. However, the instruction snippet (we call it a gadget) is often unprintable. This shortcoming can limit the deployment of ROP attacks in practice, since non-ASCII scanning can detect such ROP payloads. In this paper, we present a novel method that only uses the printable gadgets,...

chapter

System Software for the Computing System "Electronica SSBIS"

Victor Ivannikov, Sergey Gaisaryan, Alexander Tomilin

2014 Third International Conference on Computer Technology in Russia and in the Former Soviet Union > 118 - 124

2014 Third International Conference on Computer Technology in Russia and in the Former Soviet Union (SoRuCom)

System Software for the Computing System "Electronica SSBIS"

chapter

Present

2014 International Conference on High Performance Computing & Simulation (HPCS) > 1 - 16

2014 International Conference on High Performance Computing & Simulation (HPCS)

article

Efficiently Securing Systems from Code Reuse Attacks

Mehmet Kayaalp, Meltem Ozsoy, Nael Abu Ghazaleh, Dmitry Ponomarev

IEEE Transactions on Computers > 2014 > 63 > 5 > 1144 - 1156

Code reuse attacks (CRAs) are recent security exploits that allow attackers to execute arbitrary code on a compromised machine. CRAs, exemplified by return-oriented and jump-oriented programming approaches, reuse fragments of the library code, thus avoiding the need for explicit injection of attack code on the stack. Since the executed code is reused existing code, CRAs bypass current hardware and...

chapter

A kind of embedded system development method based on C#

Huiyu Yang, Junjie Zhao, Yishan Zhang, Qingdong Ma

2013 International Conference on Quality, Reliability, Risk, Maintenance, and Safety Engineering (QR2MSE) > 2035 - 2038

2013 International Conference on Quality, Reliability, Risk, Maintenance, and Safety Engineering (QR2MSE)

Aiming at addressing difficulties in the development of embedded system, this paper proposes a new method. Based on the object-oriented concepts and by reference to the touching operating mode of the smart phones or tablet PCs, we achieved a new set of smart touching UI controls and device driver classes library in C# with the .NET Compact Framework environment. This method improves the programming...

chapter

A Decoupled Execution Paradigm for Data-Intensive High-End Computing

Yong Chen, Chao Chen, Xian-He Sun, William D. Gropp, et al.

2012 IEEE International Conference on Cluster Computing > 200 - 208

2012 IEEE International Conference on Cluster Computing (CLUSTER)

High-end computing (HEC) applications in critical areas of science and technology tend to be more and more data intensive. I/O has become a vital performance bottleneck of modern HEC practice. Conventional HEC execution paradigms, however, are computing-centric for computation intensive applications. They are designed to utilize memory and CPU performance and have inherent limitations in addressing...

chapter

Agent based platform for the design and simulation of wireless sensor networks

Abdelhakim Hamzi, Mouloud Koudil

2012 International Conference on Computer, Information and Telecommunication Systems (CITS) > 1 - 5

2012 International Conference on Computer, Information and Telecommunication Systems (CITS)

In this work we propose a flexible, generic and agent based platform for the design and simulation of wireless sensor networks (WSN) protocols and applications. This platform is independent from the physical architecture of nodes, where each node is considered as an autonomous agent having its own properties and behaviors according to the role it takes in the network. Once the description of the WSN...

chapter

FPM: A Flexible Programming Model for MPSoC on FPGA

Chao Wang, Xi Li, Junneng Zhang, Peng Chen, et al.

2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum > 477 - 484

2012 26th IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

This paper proposes a flexible programming model (FPM), which addresses the automatic parallel execution for functional tasks on heterogeneous multiprocessors. Guided by the simply annotated source codes, a front-end source to source compiler is provided to identify the parallel regions and generate the sources codes. A runtime middleware analyzes the inter-task data dependencies and schedules the...

INFONA - science communication portal
