This paper describes the use of an array notation called Parray in the refinement of parallel programs, covering an array type that separates the physical data layout from the logical structure of multi-dimensional data, and the control-flow diversion of heterogeneous processor units. A case study on matrix multiplication demonstrates refinement of Parray programs: the code evolves from a simple single CPU-thread...
With the increasing prominence of many-core architectures and decreasing per-core resources on large supercomputers, a number of application developers are investigating the use of hybrid MPI+threads programming to utilize computational units while sharing memory. An MPI-only model that uses one MPI process per system core is capable of effectively utilizing the processing units, but it fails to...
This paper describes a new reliable transport protocol designed to run on top of a multicast network service for delivery of continuously generated files. The motivation for this work is to support scientific computing Grid applications that require file transfers between geographically distributed data centers. For example, atmospheric research scientists at various universities subscribe to real-time...
The recently released MPI-3.0 standard introduced a process-level shared-memory interface which enables processes within the same node to have direct load/store access to each other's memory. Such an interface allows applications to declare data structures that are shared by multiple MPI processes on the node. In this paper, we study the capabilities and performance implications of using MPI-3.0 shared...
We review the High Performance Computing Enhanced Apache Big Data Stack (HPC-ABDS) and summarize the capabilities in 21 identified architecture layers. These cover Message and Data Protocols, Distributed Coordination, Security & Privacy, Monitoring, Infrastructure Management, DevOps, Interoperability, File Systems, Cluster & Resource Management, Data Transport, File Management, NoSQL,...
In IaaS Cloud Computing platforms, elasticity offers users the possibility to adjust the number of resources to the current workload, taking into account peak (high activity) and trough (low activity) periods by powering down/up some resources. This elasticity principally consists in dynamically starting/stopping Virtual Machines to increase/reduce the computing capacities. In this paper, we study...
Many-core architecture provides a massively parallel environment with dozens of cores and hundreds of hardware threads. Scientific application programmers are increasingly looking at ways to utilize such large numbers of lightweight cores for various programming models. Efficiently executing these models on massively parallel many-core environments is not easy, however, and performance may be degraded...
MapReduce enables parallel and distributed processing of vast amounts of data on a cluster of machines. However, such a computing paradigm is subject to threats posed by malicious and cheating nodes or compromised user-submitted code that could tamper with data and computation, since users maintain little control as the computation is carried out in a distributed fashion. In this paper, we focus on the analysis...
Heterogeneous platforms integrating different types of processing units (such as multi-core CPUs and GPUs) are in high demand in high performance computing. Existing studies have shown that using heterogeneous platforms can improve application performance and hardware utilization. However, systematic methods to design, implement, and map applications to efficiently use heterogeneous computing resources...
Emerging big data applications comprise rich multi-faceted workflows with both compute-intensive and data-intensive tasks, and intricate communication patterns. While MapReduce is an effective model for data-intensive tasks, the MPI programming model may be better suited for extracting high performance for compute-intensive tasks. Researchers have recognized this need to employ specialized models...
Energy consumption has become one of the most important factors in High Performance Computing platforms. However, while there are various algorithmic and programming techniques to save energy, users currently have no incentive to employ them, as they might result in worse performance. We propose to manage the energy budget of a supercomputer through EnergyFairShare (EFS), a FairShare-like scheduling...
Despite the popularity of the Apache Hadoop system, its success has been limited by issues such as single points of failure, centralized job/task management, and lack of support for programming models other than MapReduce. The next generation of Hadoop, Apache Hadoop YARN, is designed to address these issues. In this paper, we propose YARNsim, a simulation system for Hadoop YARN. YARNsim is based...
We consider the classical First Come First Served / backfilling algorithm which is commonly used in actual batch schedulers. As HPC platforms grow in size and complexity, an interesting question is how to enhance this algorithm in order to improve global performance by reducing the overall amount of communications. In this direction, we are interested in studying the impact of contiguity and locality...
An important problem in discrete graphical models is the maximum a posteriori (MAP) inference problem. Recent research has focused on the development of parallel MAP inference algorithms that scale to graphical models with millions of nodes. In this paper, we introduce a parallel implementation of the recently proposed Bethe-ADMM algorithm using the Message Passing Interface (MPI), which allows us...
Latency-sensitive multiparty applications involve intensive communication between multiple participating nodes. Relays are usually adopted for matchmaking end hosts, filtering unwanted traffic, bypassing routing outages and so on. Speeding up relay communication becomes increasingly important to improve the QoE of clients. Currently, no rigorous guarantees have been made for the latency-optimal...
In this paper we address the problem of network contention between the migration traffic and the VM application traffic for the live migration of co-located Virtual Machines (VMs). When VMs are migrated with pre-copy, they run at the source host during the migration. Therefore the VM applications with predominantly outbound traffic contend with the outgoing migration traffic at the source host. Similarly,...
Many-core architectures such as graphics processing units (GPUs) rely on thread-level parallelism (TLP) to overcome pipeline hazards. Consequently, each core in a many-core processor employs a relatively simple in-order pipeline with limited capability to exploit instruction-level parallelism (ILP). In this paper, we study the ILP impact on the throughput-oriented many-core architecture, including...
Cloud Computing with Virtualization offers attractive flexibility and elasticity to deliver resources by providing a platform for consolidating complex IT resources in a scalable manner. However, efficiently running HPC applications on Cloud Computing systems is still full of challenges. One of the biggest hurdles in building efficient HPC clouds is the unsatisfactory performance offered by underlying...
The demand for parallel I/O performance continues to grow. However, modelling and generating parallel I/O workloads are challenging for several reasons, including the large number of processes, I/O request dependencies and workload scalability. In this paper, we propose PIONEER, a complete solution to Parallel I/O workload characterization and gEnERation. The core of PIONEER is a proposed generic...
An efficient implementation of the Process Management Interface (PMI) is crucial to enable fast start-up of MPI jobs. We propose three extensions to the PMI specification: 1) a blocking allgather collective (PMIX_Allgather), 2) a non-blocking allgather collective (PMIX_Iallgather), and 3) a non-blocking fence (PMIX_KVS_Ifence). We design and evaluate several PMI implementations to demonstrate how...