2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

chapter

Reconciling scratch space consumption, exposure, and volatility to achieve timely staging of job input data

Henry M Monti, Ali R Butt, Sudharshan S Vazhkudai

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 12

Innovative scientific applications and emerging dense data sources are creating a data deluge for high-end computing systems. Processing such large input data typically involves copying (or staging) onto the supercomputer's specialized high-speed storage, scratch space, for sustained high I/O throughput. The current practice of conservatively staging data as early as possible makes the data vulnerable...

chapter

Evaluating standard-based self-virtualizing devices: A performance study on 10 GbE NICs with SR-IOV support

Jiuxing Liu

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 12

Virtual machine (VM) technologies have made much progress in improving the efficiency of virtualizing CPU and memory. However, achieving high performance for I/O virtualization remains a challenge, especially for high speed networking devices such as 10 Gigabit Ethernet (10GbE) NICs, and commonly used software-based I/O virtualization approaches usually suffer significant performance degradation compared...

chapter

A parallel architecture for meaning comparison

Suneil Mohan, Amitava Biswas, Aalap Tripathy, Jagannath Pannigrahy, more

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 10

In this paper we present a fine grained parallel architecture that performs meaning comparison using vector cosine similarity (dot product). Meaning comparison assigns a similarity value to two objects (e.g. text documents) based on how similar their meanings (represented as two vectors) are to each other. The novelty of our design is the fine grained parallelism which is not exploited in available...

chapter

A hybrid Interest Management mechanism for peer-to-peer Networked Virtual Environments

Ke Pan, Wentong Cai, Xueyan Tang, Suiping Zhou, more

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 12

An Interest Management (IM) mechanism eliminates irrelevant status updates transmitted in Networked Virtual Environments (NVE). However, IM itself involves both computation and communication overhead, of which the latter is the focus of this paper. Traditionally, there are area-based and cell-based IM mechanisms. This paper proposes a hybrid IM mechanism for peer-to-peer NVEs that utilizes the cell-based...

chapter

On-line detection of large-scale parallel application's structure

German Llort, Juan Gonzalez, Harald Servat, Judit Gimenez, more

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 10

With larger and larger systems being constantly deployed, trace-based performance analysis of parallel applications has become a challenging task. Even if the amount of performance data gathered per single process is small, traces rapidly become unmanageable when merging together the information collected from all processes. In general, an efficient analysis of such a large volume of data is subject...

chapter

Parallel computation of best connections in public transportation networks

Daniel Delling, Bastian Katz, Thomas Pajor

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 12

Exploiting parallelism in route planning algorithms is a challenging algorithmic problem with obvious applications in mobile navigation and timetable information systems. In this work, we present a novel algorithm for the so-called one-to-all profile-search problem in public transportation networks. It answers the question for all fastest connections between a given station S and any other station...

chapter

A low cost split-issue technique to improve performance of SMT clustered VLIW processors

Manoj Gupta, Fermin Sanchez, Josep Llosa

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 12

Very Long Instruction Word (VLIW) processors are a popular choice in the embedded domain due to their hardware simplicity, low cost and low power consumption. Simultaneous MultiThreading (SMT) is a popular technique for improving processor performance. To maintain execution semantics, a VLIW instruction needs to be issued in its entirety, which restricts the opportunities in SMT. Split-issue at operation-level...

chapter

Attack-resistant frequency counting

Bo Wu, Jared Saia, Valerie King

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 10

We present collaborative peer-to-peer algorithms for the problem of approximating frequency counts for popular items distributed across the peers of a large-scale network. Our algorithms are attack-resistant in the sense that they function correctly even in the case where an adaptive and computationally unbounded adversary causes up to a 1/3 fraction of the peers in the network to suffer Byzantine...

chapter

Adapting communication-avoiding LU and QR factorizations to multicore architectures

Simplice Donfack, Laura Grigori, Alok Kumar Gupta

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 10

In this paper we study algorithms for performing the LU and QR factorizations of dense matrices. Recently, two communication optimal algorithms have been introduced for distributed memory architectures, referred to as communication avoiding CALU and CAQR. In this paper we discuss two algorithms based on CAQR and CALU that are adapted to multicore architectures. They combine ideas to reduce communication...

chapter

Varying bandwidth resource allocation problem with bag constraints

Venkatesan T Chakaravarthy, Vinayaka Pandit, Yogish Sabharwal, Deva P Seetharam

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 10

We consider the problem of scheduling jobs on a pool of machines. Each job requires multiple machines on which it executes in parallel. For each job, the input specifies release time, deadline, processing time, profit and the number of machines required. The total number of machines may be different at different points of time. A feasible solution is a subset of jobs and a schedule for them such that...

chapter

QoS aware BiNoC architecture

Shih-Hsin Lo, Ying-Cherng Lan, Hsin-Hsien Yeh, Wen-Chung Tsai, more

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 10

A quality-of-service (QoS) aware, bi-directional channel NoC (BiNoC) architecture is proposed to support guarantee-service (GS) traffic while reducing packet delivery latency. By incorporating dynamically self-reconfigured bidirectional communication channels between adjacent routers, BiNoC architecture promises more flexibility for various traffic flow patterns. A novel inter-router communication...

chapter

Dynamic fractional resource scheduling for HPC workloads

Mark Stillwell, Frederic Vivien, Henri Casanova

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 12

We propose a novel job scheduling approach for homogeneous cluster computing platforms. Its key feature is the use of virtual machine technology for sharing resources in a precise and controlled manner. We justify our approach and propose several job scheduling algorithms. We present results obtained in simulations for synthetic and real-world High Performance Computing (HPC) workloads, in which we...

chapter

Improving the performance of hypervisor-based fault tolerance

Jun Zhu, Wei Dong, Zhefu Jiang, Xiaogang Shi, more

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 10

Hypervisor-based fault tolerance (HBFT), a checkpoint-recovery mechanism, is an emerging approach to sustaining mission-critical applications. Based on virtualization technology, HBFT provides an economic and transparent solution. However, the advantages currently come at the cost of substantial overhead during failure-free execution, especially for memory-intensive applications. This paper presents an in-depth...

chapter

Engineering a scalable high quality graph partitioner

Manuel Holtgrewe, Peter Sanders, Christian Schulz

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 12

We describe an approach to parallel graph partitioning that scales to hundreds of processors and produces a high solution quality. For example, for many instances from Walshaw's benchmark collection we improve the best known partitioning. We use the well known framework of multi-level graph partitioning. All components are implemented by scalable parallel algorithms. Quality improvements compared...

chapter

Algorithmic Cholesky factorization fault recovery

Doug Hakkarinen, Zizhong Chen

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 10

Modeling and analysis of large scale scientific systems often use linear least squares regression, frequently employing Cholesky factorization to solve the resulting set of linear equations. With large matrices, this often will be performed in high performance clusters containing many processors. Assuming a constant failure rate per processor, the probability of a failure occurring during the execution...

chapter

An auto-tuning framework for parallel multicore stencil computations

Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf, more

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 12

Although stencil auto-tuning has shown tremendous potential in effectively utilizing architectural resources, it has hitherto been limited to single kernel instantiations; in addition, the large variety of stencil kernels used in practice makes this computation pattern difficult to assemble into a library. This work presents a stencil auto-tuning framework that significantly advances programmer productivity...

chapter

Decentralized resource management for multi-core desktop grids

Jaehwan Lee, Pete Keleher, Alan Sussman

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 11

The majority of CPUs now sold contain multiple computing cores. However, current desktop grid computing systems either ignore the multiplicity of cores, or treat them as distinct, independent machines. The latter approach ignores the resource contention present between cores in a single CPU, while the former approach fails to take advantage of significant computing power. We propose a decentralized...

chapter

Power-aware resource provisioning in cluster computing

Kaiqi Xiong

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 11

The high power consumption of cluster computing infrastructures has become a major concern. It leads to increased heat dissipation and decreased reliability of cluster servers. Power management has become a critical issue in cluster computing. In this paper, we start with an analysis of the relationship between cluster performance and power consumption. We study both the problem of minimizing the...

chapter

DEBAR: A scalable high-performance de-duplication storage system for backup and archiving

Tianming Yang, Hong Jiang, Dan Feng, Zhongying Niu, more

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 12

Driven by the increasing demand for large-scale and high-performance data protection, disk-based de-duplication storage has become a new research focus of the storage industry and research community where several new schemes have emerged recently. So far these systems are mainly inline de-duplication approaches, which are centralized and do not lend themselves easily to be extended to handle global...

chapter

Analyzing and adjusting user runtime estimates to improve job scheduling on the Blue Gene/P

Wei Tang, Narayan Desai, Daniel Buettner, Zhiling Lan

2010 IEEE International Symposium on Parallel&Distributed Processing (IPDPS) > 1 - 11

Backfilling and short-job-first are widely acknowledged enhancements to the simple but popular first-come, first-served job scheduling policy. However, both enhancements depend on user-provided estimates of job runtime, which research has repeatedly shown to be inaccurate. We have investigated the effects of this inaccuracy on backfilling and different queue prioritization policies, determining which...
