2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW)

Chapter

A Complex Event Processing Toolkit for Detecting Technical Chart Patterns

Madhushi Niluka Bandara, Rajitha Madhushan Ranasinghe, Rashmi Woranga Mudugamuwa Arachchi, Channa Gayan Somathilaka, et al.

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 547 - 556

With the advent of large, high-volume data, we have seen a need for real-time analytic techniques like Complex Event Processing. This paper extends a Complex Event Processing engine to support real-time identification of technical chart patterns from streaming data. Technical chart patterns are known, interesting, recurring patterns in time series data, and they are used by experts in time series data...

Chapter

Towards Detecting Patterns in Failure Logs of Large-Scale Distributed Systems

Nentawe Gurumdimma, Arshad Jhumka, Maria Liakata, Edward Chuah, et al.

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 1052 - 1061

The ability to automatically detect faults or fault patterns to enhance system reliability is important for system administrators in reducing system failures. To achieve this objective, the message logs from cluster systems are augmented with failure information, i.e., the raw log data is labelled. However, tagging or labelling of raw log data is very costly. In this paper, our objective is to detect...

Chapter

On the Impact of Execution Models: A Case Study in Computational Chemistry

Daniel Chavarria-Miranda, Mahantesh Halappanavar, Sriram Krishnamoorthy, Joseph Manzano, et al.

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 255 - 264

Efficient utilization of high-performance computing (HPC) platforms is an important and complex problem. Execution models, abstract descriptions of the dynamic runtime behavior of the execution stack, have significant impact on the utilization of HPC systems. Using a computational chemistry kernel as a case study and a wide variety of execution models combined with load balancing techniques, we explore...

Chapter

Computing the Pseudo-Inverse of a Graph's Laplacian Using GPUs

Nishant Saurabh, Ana Lucia Varbanescu, Gyan Ranjan

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 265 - 274

Many applications in network analysis require the computation of the network's Laplacian pseudo-inverse - e.g., topological centrality in social networks or estimating commute times in electrical networks. As large graphs become ubiquitous, the traditional approaches - with quadratic or cubic complexity in the number of vertices - do not scale. To alleviate this performance issue, a divide-and-conquer...

Chapter

Understanding Performance Portability of OpenACC for Supercomputers

Suttinee Sawadsitang, James Lin, Simon See, Francois Bodin, et al.

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 699 - 707

Scientific applications need to be moved among supercomputers, such as Tianhe-2 and TSUBAME 2.5. OpenACC provides a directive-based approach for a single source code base with function portability across different accelerators used in the supercomputers. However, the performance portability is not guaranteed by the OpenACC standard. Therefore, we propose a systematic optimization method, instead of...

Chapter

Combining Backward and Forward Recovery to Cope with Silent Errors in Iterative Solvers

Massimiliano Fasi, Yves Robert, Bora Ucar

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 980 - 989

Several recent papers have introduced a periodic verification mechanism to detect silent errors in iterative solvers. Chen [PPoPP'13, pp. 167-176] has shown how to combine such a verification mechanism (a stability test checking the orthogonality of two vectors and recomputing the residual) with checkpointing: the idea is to verify every d iterations, and to checkpoint every c × d iterations....
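
The schedule mentioned in this abstract (verify every d iterations, checkpoint every c × d iterations) can be sketched as follows. This is a minimal illustration of the backward-recovery schedule only, not the authors' combined method. The names `run_with_verification`, `step`, and `verify` are hypothetical, and the toy "solver" simply counts iterations so that a corrupted value is easy to detect.

```python
def run_with_verification(step, verify, state, n_iters, d, c):
    """Run n_iters iterations of `step`, verifying every d iterations and
    checkpointing every c*d iterations. Returns (final_state, rollbacks)."""
    checkpoint = (0, state)  # (iteration count, saved state); state must be immutable
    rollbacks = 0
    i = 0
    while i < n_iters:
        state = step(state, i)
        i += 1
        if i % d != 0:
            continue
        if not verify(state, i):
            # Silent error detected: roll back to the last checkpoint.
            i, state = checkpoint
            rollbacks += 1
        elif i % (c * d) == 0:
            # Only a verified state is saved.
            checkpoint = (i, state)
    return state, rollbacks


# Toy example (assumption): the state should always equal the iteration count.
fault_pending = [True]

def step(state, i):
    # Inject a single silent error the first time iteration 7 is executed.
    if i == 7 and fault_pending[0]:
        fault_pending[0] = False
        return state + 100
    return state + 1

def verify(state, i):
    return state == i

final_state, rollbacks = run_with_verification(step, verify, 0, n_iters=20, d=2, c=3)
print(final_state, rollbacks)
```

In this toy run the corrupted value is caught at the next verification point, and the computation restarts from the checkpoint taken at iteration 6. A larger d lowers verification cost but lengthens the work lost per detected error.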

Chapter

GPU-based Parallel R-tree Construction and Querying

Sushil K. Prasad, Michael McDermott, Xi He, Satish Puri

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 618 - 627

An R-tree is a data structure for organizing and querying multi-dimensional non-uniform and overlapping data. Efficient parallelization of R-trees is an important problem due to societal applications such as geographic information systems (GIS), spatial database management systems, and VLSI layout, which employ R-trees for spatial analysis tasks such as map-overlay. As graphics processing units (GPUs)...

Chapter

Directive-Based Auto-Tuning for the Finite Difference Method on the Xeon Phi

Takahiro Katagiri, Satoshi Ohshima, Masaharu Matsumoto

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 1221 - 1230

In this paper, we present a directive-based auto-tuning (AT) framework, called ppOpen-AT, and demonstrate its effect using simulation code based on the Finite Difference Method (FDM). The framework utilizes well-known loop transformation techniques. However, the codes used are carefully designed to minimize the software stack in order to meet the requirements of a many-core architecture currently...

Chapter

Heterogeneous Habanero-C (H2C): A Portable Programming Model for Heterogeneous Processors

Deepak Majeti, Vivek Sarkar

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 708 - 717

Heterogeneous architectures with their diverse architectural features impose significant programmability challenges. Existing programming systems involve non-trivial learning and are not productive, not portable, and are challenging to tune for performance. In this paper, we introduce Heterogeneous Habanero-C (H2C), which is an implementation of the Habanero execution model for modern heterogeneous...

Chapter

Performance Portable Applications for Hardware Accelerators: Lessons Learned from SPEC ACCEL

Guido Juckeland, Alexander Grund, Wolfgang E. Nagel

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 689 - 698

The popular and diverse hardware accelerator ecosystem makes apples-to-apples comparisons between platforms rather difficult. SPEC ACCEL tries to offer a yardstick to compare different accelerator hardware and software ecosystems. This paper uses this SPEC benchmark to compare an AMD GPU, an NVIDIA GPU and an Intel Xeon Phi with respect to performance and energy consumption. It also provides observations...

Chapter

Fast Sparse Matrix and Sparse Vector Multiplication Algorithm on the GPU

Carl Yang, Yangzihao Wang, John D. Owens

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 841 - 847

We implement a promising algorithm for sparse-matrix sparse-vector multiplication (SpMSpV) on the GPU. An efficient k-way merge lies at the heart of finding a fast parallel SpMSpV algorithm. We examine the scalability of three approaches -- no sorting, merge sorting, and radix sorting -- in solving this problem. For breadth-first search (BFS), we achieve a 1.26x speedup over state-of-the-art sparse-matrix...
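
To make the role of the k-way merge concrete, here is a minimal sequential sketch of SpMSpV. It is a CPU illustration only, not the authors' GPU algorithm. It assumes the matrix is stored in compressed sparse column (CSC) form with row indices sorted within each column, and the names `spmspv`, `col_ptr`, `row_idx`, and `vals` are hypothetical.

```python
import heapq

def spmspv(col_ptr, row_idx, vals, x):
    """Multiply a CSC matrix by a sparse vector.

    x is a list of (index, value) pairs; the result is a list of
    (row, value) pairs sorted by row.
    """
    # Each nonzero x_j selects column j, scaled by x_j. Every such column
    # is already sorted by row index (assumption stated above).
    streams = []
    for j, xj in x:
        start, end = col_ptr[j], col_ptr[j + 1]
        streams.append([(row_idx[k], vals[k] * xj) for k in range(start, end)])

    # k-way merge of the scaled columns, then sum entries sharing a row.
    result = []
    for row, prod in heapq.merge(*streams, key=lambda t: t[0]):
        if result and result[-1][0] == row:
            result[-1] = (row, result[-1][1] + prod)
        else:
            result.append((row, prod))
    return result


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]  stored column by column
col_ptr = [0, 2, 3, 5]
row_idx = [0, 2, 1, 0, 2]
vals = [1.0, 4.0, 3.0, 2.0, 5.0]
x = [(0, 1.0), (2, 2.0)]  # the sparse vector [1, 0, 2]

y = spmspv(col_ptr, row_idx, vals, x)
print(y)
```

Only the columns selected by nonzeros of x are touched, which is why the merge step dominates the cost when x is very sparse.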

Chapter

Energy Modeling and Optimization for Tiled Nested-Loop Codes

Nirmal Prajapati, Waruna Ranasinghe, Vamshi Tandrapati, Rumen Andonov, et al.

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 888 - 895

We develop a methodology for modeling the energy efficiency of tiled nested-loop codes running on a graphics processing unit (GPU) and use it for energy efficiency optimization. We assume that a highly optimized and parametrized version of a tiled nested-loop code, either written by an expert programmer or automatically produced by a polyhedral compilation tool...

Chapter

A Roofline-Based Performance Estimator for Distributed Matrix-Multiply on Intel CnC

Martin Kong, Louis-Noel Pouchet, Ponnuswamy Sadayappan

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 1241 - 1250

In this paper we show how to analytically model two widely used distributed matrix-multiply algorithms, Cannon's 2D and Johnson's 3D, implemented within the Intel Concurrent Collections framework for shared/distributed memory execution. Our precise analytical model proceeds by estimating the computation time and communication times, taking into account factors such as the block size, communication...

Chapter

Improved Internode Communication for Tile QR Decomposition for Multicore Cluster Systems

Tomohiro Suzuki

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 1214 - 1220

Tile algorithms for matrix decomposition can generate many fine-grained tasks. Therefore, their suitability for processing with multicore architectures has attracted much attention from the high-performance computing (HPC) community. Our implementation of tile QR decomposition for a cluster system has dynamic scheduling, OpenMP work-sharing, and other useful features. In this article, we discuss...

Chapter

Estimation of Non-functional Properties for Embedded Hardware with Application to Image Processing

Christian Herglotz, Jurgen Seiler, Andre Kaup, Arne Hendricks, et al.

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 190 - 195

In recent years, due to a higher demand for portable devices, which provide restricted amounts of processing capacity and battery power, the need for energy- and time-efficient hardware and software solutions has increased. Preliminary estimations of time and energy consumption can thus be valuable to improve implementations and design decisions. To this end, this paper presents a method to estimate the...

Chapter

Machine Learning Based Auto-Tuning for Enhanced OpenCL Performance Portability

Thomas L. Falch, Anne C. Elster

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 1231 - 1240

Heterogeneous computing, which combines devices with different architectures, is rising in popularity, and promises increased performance combined with reduced energy consumption. OpenCL has been proposed as a standard for programming such systems, and offers functional portability. It does, however, suffer from poor performance portability: code tuned for one device must be re-tuned to achieve good...

Chapter

Streamlining Whole Function Vectorization in C Using Higher Order Vector Semantics

Gil Rapaport, Ayal Zaks, Yosi Ben-Asher

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 718 - 727

Taking full advantage of SIMD instructions in C programs still requires tedious and non-portable programming using intrinsics, despite considerable efforts spent developing auto-vectorization capabilities in recent decades. Whole Function Vectorization (WFV) is a recent technique for extending the use of SIMD across entire functions. WFV has so far only been used in data-parallel languages such as...

Chapter

Folding Methods for Event Timelines in Performance Analysis

Matthias Weber, Ronald Geisler, Holger Brunst, Wolfgang E. Nagel

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 205 - 214

The complexity of today's high performance computing systems, and their parallel software, requires performance analysis tools to fully understand application performance behavior. The visualization of event streams has proven to be a powerful approach for the detection of various types of performance problems. However, visualization of large numbers of process streams quickly hits the limits of available...

Chapter

Graphulo: Linear Algebra Graph Kernels for NoSQL Databases

Vijay Gadepally, Jake Bolewski, Dan Hook, Dylan Hutchison, et al.

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 822 - 830

Big data and the Internet of Things era continue to challenge computational systems. Several technology solutions such as NoSQL databases have been developed to deal with this challenge. In order to generate meaningful results from large datasets, analysts often use a graph representation which provides an intuitive way to work with the data. Graph vertices can represent users and events, and edges...

Chapter

Towards a Combined Grouping and Aggregation Algorithm for Fast Query Processing in Columnar Databases with GPUs

Sina Meraji, John Keenleyside, Sunil Kamath, Bob Blainey

2015 IEEE International Parallel and Distributed Processing Symposium Workshop > 594 - 603

Column-store in-memory databases have received a lot of attention because of their fast query processing response times on modern multi-core machines. Among different database operations, group by/aggregate is an important and potentially costly operation. Moreover, sort-based and hash-based algorithms are the most common ways of processing group by/aggregate queries. While sort-based algorithms are...
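
The two families of group by/aggregate algorithms named in this abstract can be contrasted with a small sequential sketch. It illustrates the general idea only, not the combined GPU algorithm the paper proposes. The function names and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def group_sum_hash(rows):
    """Hash-based: one pass, accumulating into a table keyed by group."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals

def group_sum_sort(rows):
    """Sort-based: sort by key, then aggregate each run of equal keys."""
    ordered = sorted(rows, key=itemgetter(0))
    return {
        key: sum(value for _, value in run)
        for key, run in groupby(ordered, key=itemgetter(0))
    }


# Sample (key, value) rows, standing in for a two-column table.
rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]

hash_result = group_sum_hash(rows)
sort_result = group_sum_sort(rows)
print(hash_result, sort_result)
```

Both variants return the same totals. The hash version avoids the sort but needs a table sized to the number of groups, while the sort version pays for ordering and then aggregates contiguous runs.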
