The digital revolution, together with the evolution of social networking and the widespread adoption of personal and mobile devices, has opened several new avenues. In particular, crowdsourcing has emerged and attracted interest as an innovative approach to performing tedious and repetitive work by outsourcing it to a wide population of people, a crowd. In this paper, such new trends are discussed by analyzing some...
Nowadays, Graphics Processing Units (GPUs) have become popular as general-purpose processors; they are used as co-processors alongside CPUs, forming heterogeneous systems. CPUs and GPUs have different execution capabilities, energy consumption and thermal characteristics. Typically, the role of the GPU is to execute the parallel parts of the job and the role of the CPU (i.e., host) is to execute the...
With the growing significance of green computing and the difficulty of obtaining accurate real-time power measurements, there is an increasing need for accurate and reliable power estimation techniques for energy-aware performance optimization. In this paper we present a statistical approach for building accurate power models using Performance Monitoring Counters (PMC) as effective proxies for x86 systems...
We present a unifying approach to monitoring and analyzing various metrics crucial in understanding the operational characteristics at different levels of HPC systems. Increase in the performance of HPC-scale processors has been closely followed by an increase in the power draw of the processors and the scale of HPC systems. Consequently, the relationship between the thermal and power characteristics...
Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.
In this paper we propose a low-overhead optimizer for the ubiquitous sparse matrix-vector multiplication (SpMV) kernel on the Intel Xeon Phi manycore processor. The architectural differences of such processors compared to their multicore counterparts overly expose inherent structural weaknesses of different sparse matrices, intensifying performance issues beyond the traditionally reported memory bandwidth...
Programmers use begin constructs in Chapel to create fire-and-forget-style tasks, which do not perform any implicit synchronization with the parent task. While this provides a good facility to invoke parallel tasks, it poses issues when the child task accesses a variable declared in the scope of its ancestor. If the parent task exits before the child, its scope is deallocated and the child may end...
In recent years, the search for scalability in terms of computing power has led to very complex parallel computer architectures, which require greater control of storage and computation resources to utilize all the available hardware capacity for optimal performance. New solutions at the level of programming languages and models have increased the reliance on, and need for, threads. A system with a huge...
Stencil computations are not well optimized by general-purpose production compilers and the increased use of multicore, manycore, and accelerator-based systems makes the optimization problem even more challenging. In this paper we present Snowflake, a Domain Specific Language (DSL) for stencils that uses a "micro-compiler" approach, i.e., small, focused, domain-specific code generators....
We introduce a new network structure named a (S, T)-maximal directed acyclic graph (DAG). A (S, T)-maximal DAG is a mixed graph which allows both directed edges and undirected edges. It is constructed, for any given connected undirected network with a set of S nodes specified as source nodes and a set of T nodes specified as sink nodes, by assigning directions to as many undirected edges as possible...
Provides an abstract of the keynote presentation and a brief professional biography of the presenter. The complete presentation was not made available for publication as part of the conference proceedings.
In this paper, we present our work to enable optimized one-sided communication operations on the ARM v8 architecture using a high-performance InfiniBand network interconnect, as well as an evaluation of our implementation. For this study, we started with an OpenSHMEM implementation based on Open MPI/SHMEM, and combined it with the UCX framework and the XPMEM kernel extension for shared memory communication...
We consider the distributed setting of N autonomous mobile agents that operate in Look-Compute-Move (LCM) cycles and communicate with other agents using colored lights (the agents with lights model). We study the fundamental COMPLETE VISIBILITY problem of repositioning N agents on a plane so that each agent is visible to all others. We assume obstructed visibility under which an agent cannot see another...
Arvo is a new programming language that focuses on concurrency. Its primary goal is to provide the programmer with a simple and concise way to design concurrent systems without explicitly identifying and differentiating concurrent and sequential sections. It does this by preventing the programmer from explicitly defining an order between statements or expressions. Thus Arvo conceptually launches...
Alternating least squares (ALS) has proven to be an effective solver of matrix factorization for recommender systems. To speed up factorization, various parallel ALS solvers have been proposed to leverage modern multi-core CPUs and many-core GPUs/MICs. Existing implementations are limited in either speed or portability (constrained to certain platforms). In this paper, we present an...
We focus on sorting, which is the building block of many machine learning algorithms, and propose a novel distributed sorting algorithm, named CodedTeraSort, which substantially improves the execution time of the TeraSort benchmark in Hadoop MapReduce. The key idea of CodedTeraSort is to impose structured redundancy in data, in order to enable in-network coding opportunities that overcome the data...
Heterogeneous processing has recently gained popularity in the high-performance computing (HPC) area and appears to have great potential for future data centers. In this regard, accelerators, such as GPUs and Intel Xeon Phi, have already started to play a significant role in HPC systems, offering a high degree of parallelism to application developers. Furthermore, hardware virtualization is gaining...