Prefix scan (or simply scan) is an operator that computes all the partial sums of a vector. A scan operation results in a vector where each element is the sum of the preceding elements in the original vector up to the corresponding position. Scan is a key operation in many important problems, such as sorting, lexical analysis, string comparison, and image filtering, among others. Although there are libraries...
The cost of data movement has always been an important concern in high performance computing (HPC) systems. It has now become the dominant factor in terms of both energy consumption and performance. Support for expression of data locality has been explored in the past, but those efforts have had only modest success in being adopted in HPC applications for various reasons. However, with the increasing...
Reading and writing data efficiently from storage systems is necessary for most scientific simulations to achieve good performance at scale. Many software solutions have been developed to decrease the I/O bottleneck. One well-known strategy, in the context of collective I/O operations, is the two-phase I/O scheme. This strategy consists of selecting a subset of processes to aggregate contiguous pieces...
This paper shows that Julia provides sufficient performance to bridge the performance gap between productivity-oriented languages and low-level languages for complex memory intensive computation tasks such as graph traversal. We provide performance guidelines for using complex low-level data structures in high productivity languages and present the first parallel integration on the productivity-oriented...
The Go language lacks built-in data structures that allow fine-grained concurrent access. In particular, its map data type, one of only two generic collections in Go, limits concurrency to the case where all operations are read-only; any mutation (insert, update, or remove) requires exclusive access to the entire map. The tight integration of this map into the Go language and runtime precludes its...
Despite various cloud technologies that have parallelized and scaled up big data analysis, they mostly target text data, which is easy to partition and thus easy to map over a cluster system. Therefore, their parallelization does not necessarily cover scientific structured data such as NetCDF, or it needs additional, user-provided tools to convert the original data into specific formats. To facilitate...
Processing In Memory (PIM), the concept of integrating processing directly with memory, has attracted considerable attention because it can help overcome the throughput limitation caused by data movement between the CPU and memory. The challenge, however, is that it requires the programmers to have a deep understanding of the PIM architecture to maximize the benefits such as data locality and...
The adoption of a programming language is positively influenced by the breadth of its software libraries. Chapel is a modern and relatively young parallel programming language. Consequently, not many domain-specific software libraries exist that are written for Chapel. Graph processing is an important domain with many applications in cyber security, energy, social networking, and health. Implementing...
High-performance distributed memory applications often load or receive data in a format that differs from what the application uses. One such difference arises from how the application distributes data for parallel processing. Data must be redistributed from how it was laid out by the producer to how the application needs the data to be laid out amongst its processes. In this paper, we present a large-scale...
The central notion of this paper is that of contracts for concurrency, allowing one to capture the expected atomicity of sequences of method or service calls in a concurrent program. The contracts may be either extracted automatically from the source code, or provided by developers of libraries or software modules to reflect their expected usage in a concurrent setting. We start by extending the so-far...
dynStruct is an open source structure recovery tool for x86 binaries. It uses dynamic binary instrumentation to record information about memory accesses, which is then processed off-line to recover structures created and used by the binary. It provides a powerful web interface which not only displays the raw data and the recovered structures but also allows this information to be explored and manually...
The most common errors in programs written in C/C++ are memory-interaction errors. This work aims to develop software tools that detect memory usage errors occurring during program execution more efficiently. For these purposes, memory analysis algorithms are discussed in the context of symbolic execution. The experimental research conducted has found...
Graph processing is used in many fields of science such as sociology, risk prediction, or biology. Although analysis of graphs is important, it also poses numerous challenges, especially for large graphs which have to be processed on multicore systems. In this paper, we present a PGAS (Partitioned Global Address Space) version of the level-synchronous BFS (Breadth First Search) algorithm and its implementation...
We present DASH, a C++ template library that offers distributed data structures and parallel algorithms and implements a compiler-free PGAS (partitioned global address space) approach. DASH offers many productivity and performance features such as global-view data structures, efficient support for the owner-computes model, flexible multidimensional data distribution schemes and interoperability with...
Python is a high level language that is used by scientists for numeric computations. However, the performance of the language can be a hindrance when scaling to larger data sets, requiring some operations to be rewritten in a lower level language. To address this problem, we propose two libraries to allow numeric Python code to be optimized incrementally, requiring minimal changes. Here we describe...
We present further work on SciSpark, a Big Data framework that extends Apache Spark's in-memory parallel computing to scale scientific computations. SciSpark's current architecture and design includes: time and space partitioning of high-resolution geo-grids from NetCDF3/4; a sciDataset class providing N-dimensional array operations in Scala/Java and CF-style variable attributes (an update of our prior...
DASH is a realization of the PGAS (partitioned global address space) model in the form of a C++ template library without the need for a custom PGAS (pre-)compiler. We present the DASH NArray concept, a multidimensional array abstraction designed as an underlying container for stencil- and dense numerical applications. After introducing fundamental programming concepts used in DASH, we explain how these...
Supercomputing platforms are expected to have larger failure rates in the future because of scaling and power concerns. The memory and performance impact may vary with error types and failure modes. Therefore, localized recovery schemes will be important for scientific computations, including failure modes where application intervention is suitable for recovery. We present a resiliency methodology...
The DOE Extreme-Scale Technology Acceleration Fast Forward Storage and IO Stack project is going to have significant impact on storage systems design within and beyond the HPC community. With phase two of the project starting, it is an excellent opportunity to explore the complete design and how it will address the needs of extreme scale platforms. This paper examines each layer of the proposed stack...
Applications over sparse matrices and graphs often rely on efficient matrix representations that exploit the nonzero structure of the sparse representation. In some cases, this structure varies within the matrix, e.g., some portions are more dense and others are very sparse. For such matrices, hybrid algorithms are commonly used in sparse linear algebra and graph libraries, which employ multiple representations...