18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP 2010)

chapter

hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications

François Broquedis, Jerome Clet-Ortega, Stephanie Moreaud, Nathalie Furmento, more

2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing > 180 - 186

The increasing number of cores, shared caches and memory nodes within machines introduces a complex hardware topology. High-performance computing applications now have to carefully adapt their placement and behavior according to the underlying hierarchy of hardware resources and their software affinities. We introduce the Hardware Locality (hwloc) software which gathers hardware information about...

chapter

Lessons Learnt Porting Parallelisation Techniques for Irregular Codes to NUMA Systems

Juan A Lorenzo, Juan C Pichel, David LaFrance-Linden, Francisco F Rivera, more

2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing > 213 - 217

This work presents a study undertaken to characterise the behaviour of some parallelisation techniques for irregular codes, previously developed for SMP architectures, on a several-node SMP NUMA system. The main objective is to determine the performance effect of bus contention and cache coherency in such a complex architecture. Results show that: (1) cores which share a socket can be considered as...

chapter

Experimental Study of Six Different Implementations of Parallel Matrix Multiplication on Heterogeneous Computational Clusters of Multicore Processors

Pedro Alonso, Ravi Reddy, Alexey Lastovetsky

2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing > 263 - 270

Two strategies of distribution of computations can be used to implement parallel solvers for dense linear algebra problems for Heterogeneous Computational Clusters of Multicore Processors (HCoMs). These strategies are called Heterogeneous Process Distribution Strategy (HPS) and Heterogeneous Data Distribution Strategy (HDS). They are not novel and have been researched thoroughly. However, the advent...

chapter

On the Efficient Implementation of Reductions on the Cell Broadband Engine

Alfred Strey

2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing > 223 - 228

For a high-performance parallel implementation of many scientific algorithms, efficient realizations of combining communication patterns like reduce or all-reduce are important. Especially on the Cell Broadband Engine, a low-latency realization of such operations is not obvious. So in this paper several algorithms for implementing reductions are discussed and efficient implementations on the Cell are...

chapter

Experimenting Iterative Computations with Ordered Read-Write Locks

Pierre-Nicolas Clauss, Jens Gustedt

2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing > 155 - 162

This paper presents the first experimental results of the use of our new adaptive tool for synchronization, based on ordered read-write locks, ORWL. They provide a new synchronizing method for data-oriented parallel algorithms and are particularly suited for iterative pipelined algorithms with out-of-core data. We conducted experiments with the classic benchmarking Livermore Kernel 23 algorithm to...
