2010 International Conference on High Performance Computing (HiPC 2010)

Items from 1 to 20 out of 43 results

chapter

Reducing capacity and conflict misses using Set Saturation Levels

D Rolan, B B Fraguela, R Doallo

2010 International Conference on High Performance Computing > 1 - 9

2010 International Conference on High Performance Computing (HiPC 2010)

The well-known memory wall problem has motivated wide research in the design of caches. Last-level caches, whose misses can stall the processors for hundreds of cycles, have received particular attention. Strategies to adaptively modify the cache insertion, promotion, eviction and even placement policies have been proposed, with different techniques being better at reducing different kinds of misses. For example...

chapter

Fair bandwidth allocation in wireless mobile environment using max-flow

S K Dandapat, B Mitra, N Ganguly, R R Choudhury

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

Wireless clients must associate with a specific Access Point (AP) to communicate over the Internet. Current association methods are based on maximum Received Signal Strength Indicator (RSSI), meaning that a client associates with the strongest AP around it. This is a simple scheme that has performed well in purely distributed settings. Modern wireless networks, however, are increasingly being connected by...
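The maximum-RSSI baseline described in this abstract can be illustrated with a minimal sketch. This is not the paper's max-flow method; the function name, the AP identifiers, and the dBm readings below are hypothetical and chosen only for illustration.

```python
def associate_by_max_rssi(rssi_by_ap):
    """Return the identifier of the AP with the strongest signal.

    rssi_by_ap maps a (hypothetical) AP identifier to its measured
    signal strength in dBm; values closer to zero are stronger.
    """
    if not rssi_by_ap:
        raise ValueError("no access points in range")
    # max() over the keys, ranked by their RSSI value
    return max(rssi_by_ap, key=rssi_by_ap.get)


# Hypothetical scan result seen by one client
scan = {"ap-lobby": -71.0, "ap-lab": -48.5, "ap-hall": -63.0}
print(associate_by_max_rssi(scan))  # prints "ap-lab"
```

Each client decides using only its own scan, with no knowledge of how loaded the chosen AP already is. That is what makes the scheme simple and purely distributed.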

chapter

GRS — GPU radix sort for multifield records

S Bandyopadhyay, S Sahni

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

We develop a radix sort algorithm, GRS, suitable to sort multifield records on a graphics processing unit (GPU). We assume the ByField layout for records to be sorted. GRS is benchmarked against the radix sort algorithm, SDK, in NVIDIA's CUDA SDK 3.0 as well as the radix sort algorithm, SRTS, of Merrill and Grimshaw. Although SRTS is faster than both GRS and SDK when sorting numbers as well as records...

chapter

vNUMA-mgr: Managing VM memory on NUMA platforms

D S Rao, K Schwan

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

Continuing improvements in the scale of many-core platforms are accompanied by increased asymmetry in their memory architectures. Such NUMA architectures, however, require systems software that understands this asymmetry to attain high levels of performance, leading to significant work in optimizing operating systems like Linux and Windows to increase locality of access to memory nodes and to consider...

chapter

Automated mapping of regular communication graphs on mesh interconnects

Abhinav Bhatelé, G R Gupta, Laxmikant V Kalé, I-Hsin Chung

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

Network contention has a significantly adverse effect on the performance of parallel applications with increasing size of parallel machines. Machines of the petascale era are forcing application developers to map tasks intelligently to job partitions to achieve the best performance possible. This paper presents a framework for automated mapping of parallel applications with regular communication graphs...

chapter

A log-based redundant architecture for reliable parallel computation

Daniel Sánchez, Juan L Aragón, José M García

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

CMOS scaling exacerbates hardware errors, making reliability a major concern for recent and future microarchitecture designs. Mechanisms to provide fault tolerance in architectures must accomplish several objectives such as low performance degradation, power consumption and area overhead. Several approaches have already been proposed to provide fault tolerance for parallel codes. However, these proposals...

chapter

Approaches for parallelizing reductions on modern GPUs

Xin Huo, V T Ravi, Wenjing Ma, G Agrawal

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

GPU hardware and software have been evolving rapidly. CUDA versions 1.1 and higher started supporting atomic operations on device memory, and CUDA versions 1.2 and higher started supporting atomic operations on shared memory. This paper focuses on parallelizing applications involving reductions on GPUs. Prior to the availability of support for locking, these applications could only be parallelized...

chapter

A study of memory-aware scheduling in message driven parallel programs

I Dooley, Chao Mei, J Lifflander, L V Kale

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

This paper presents a simple but powerful memory-aware scheduling mechanism that adaptively schedules tasks in a message driven distributed-memory parallel program. The scheduler adapts its behavior whenever memory usage exceeds a threshold by scheduling tasks known to reduce memory usage. The usefulness of the scheduler and its low overhead are demonstrated in the context of an LU matrix factorization...
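The threshold rule described in this abstract can be sketched as a toy single-process scheduler. This is an illustration under stated assumptions, not the authors' implementation: the `Task` type, the `mem_delta` field, and the FIFO default policy are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    mem_delta: int  # hypothetical: net memory allocated (+) or freed (-) by the task


def pick_next(queue, mem_in_use, threshold):
    """Pop the next task: FIFO by default, but once memory use exceeds the
    threshold, prefer the first queued task known to reduce memory."""
    if mem_in_use > threshold:
        for i, task in enumerate(queue):
            if task.mem_delta < 0:
                del queue[i]
                return task
    return queue.popleft()


def run_all(tasks, threshold):
    """Run every task and return the order in which they were scheduled."""
    queue, mem_in_use, order = deque(tasks), 0, []
    while queue:
        task = pick_next(queue, mem_in_use, threshold)
        mem_in_use += task.mem_delta
        order.append(task.name)
    return order


tasks = [Task("alloc_a", 60), Task("alloc_b", 60), Task("alloc_c", 60),
         Task("free_a", -60), Task("free_b", -60), Task("free_c", -60)]
print(run_all(tasks, threshold=100))
# ['alloc_a', 'alloc_b', 'free_a', 'alloc_c', 'free_b', 'free_c']
```

With plain FIFO ordering the three allocations would run back to back and peak at 180 units. With the threshold rule, a freeing task is pulled forward each time usage passes 100, so the peak stays at 120.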

chapter

Balanced stream assignment for service facility

R Garg, L P Shahabuddin, A Verma

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

Shared data centers and clouds are gaining popularity because of their ability to reduce costs by increasing the utilization of server farms. In a shared server environment, a careful assignment of workload streams (all work-requests from a customer may constitute a stream) to servers is necessary to ensure good “end user” performance. In this work, we investigate the assignment of streams to servers...

chapter

Diagnosing the root-causes of failures from cluster log files

E Chuah, Shyh-hao Kuo, P Hiew, William-Chandra Tjhi, et al.

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

System event logs are often the primary source of information for diagnosing (and predicting) the causes of failures for cluster systems. Due to interactions among the system hardware and software components, the system event logs for large cluster systems consist of streams of interleaved events, and only a small fraction of the events over a small time span are relevant to the diagnosis of...

chapter

Reducing data center power with server consolidation: Approximation and evaluation

C Subramanian, A Vasan, A Sivasubramaniam

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

With the growing costs of powering data centers, power management is gaining importance. Server consolidation in data centers, enabled by virtualization technologies, is becoming a popular option for organizations to reduce costs and improve manageability. While consolidation offers these benefits, it is important to ensure proper resource provisioning so that performance is not compromised. In addition...

chapter

EMC²: Extending Magny-Cours coherence for large-scale servers

A Ros, B Cuesta, R Fernández-Pascual, María E Gómez, et al.

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

The demand for larger and more powerful high-performance shared-memory servers has been growing over the last few years. To meet this need, AMD has recently launched the twelve-core Magny-Cours processors. They include a directory cache (Probe Filter) that increases the scalability of the coherence protocol applied by Opterons, based on the coherent HyperTransport interconnect (cHT). cHT limits up to 8 the...

chapter

A reliable data transport protocol for partitioned actors in Wireless Sensor and Actor Networks

N Handigol, K Selvaradjou, C S R Murthy

2010 International Conference on High Performance Computing > 1 - 8

2010 International Conference on High Performance Computing (HiPC 2010)

In Wireless Sensor and Actor Networks (WSANs), effective Actor-Actor Communication (AAC) is an important requirement for timely responses to events reported by the sensors. However, due to the scattered nature of events, the mobility of actor nodes, and the low density of actor nodes, the network of actor nodes tends to get partitioned frequently. To provide effective AAC in such situations, the energy-constrained...

chapter

VCT_lite: Towards an efficient implementation of virtual cut-through switching in on-chip networks

A Roca, J Flieh, F Silla, J Duato

2010 International Conference on High Performance Computing > 1 - 12

2010 International Conference on High Performance Computing (HiPC 2010)

On-chip networks have rapidly emerged as the best interconnection choice for high-core-count chip multiprocessors (CMPs) because of their good scalability properties. Their fast evolution has been accelerated by the large inheritance from the off-chip network domain. Many of the mechanisms and techniques previously developed in that area have been directly applied to the on-chip domain due...

chapter

Impact of colinearity of sensors selected for location estimation

V P Sadaphal, B N Jain

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

We consider estimating the location of a target moving in a 2D plane by combining distance measurements from multiple sensors. Given that available energy in sensors is at a premium, it is desirable that energy be conserved by selecting fewer sensors that measure distance and communicate to the central tracker. We propose heuristics on the basis of which a handful of sensors may be selected...

chapter

Dynamic social grouping based routing in a Mobile Ad-Hoc network

R Cabaniss, S Madria, G Rush, A Trotta, et al.

2010 International Conference on High Performance Computing > 1 - 8

2010 International Conference on High Performance Computing (HiPC 2010)

The patterns of movement used by Mobile Ad-Hoc networks are application specific, in the sense that networks use nodes which travel in different paths. When these nodes are used in experiments involving social patterns, such as wildlife tracking, algorithms which detect and use these patterns can be used to improve routing efficiency. The intent of this paper is to introduce a routing algorithm which...

chapter

An integer programming framework for optimizing shared memory use on GPUs

Wenjing Ma, G Agrawal

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

General purpose computing using GPUs is becoming increasingly popular because of GPUs' extremely favorable performance/price ratio. Besides application development using CUDA, automatic code generation for GPUs is also receiving attention. Like standard processors, GPUs also have a memory hierarchy, which must be carefully optimized for in order to achieve efficient execution. Specifically, modern...

chapter

Efficient Discrete Range Searching primitives on the GPU with applications

J Soman, M K Kumar, K Kothapalli, P J Narayanan

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

Graphics processing units provide large computational power at a very low price, which positions them as a ubiquitous accelerator. Efficient primitives that can expand the range of operations performed on the GPU are thus important. Discrete Range Searching (DRS) is one such primitive with direct applications to string processing, document and text retrieval systems, and least common ancestor queries...

chapter

Avoiding performance fluctuation in cloud storage

Jianzong Wang, P Varman, Changsheng Xie

2010 International Conference on High Performance Computing > 1 - 9

2010 International Conference on High Performance Computing (HiPC 2010)

Cloud computing is an elastic computing model whereby users can lease computing and storage resources on demand from a remote infrastructure. Cloud computing is gaining popularity due to its low cost, high reliability and wide availability. However, a serious impediment to its wider deployment is the relative lack of effective data management services. The relatively slow access to persistent data...

chapter

Parallel Sparse Matrix Vector Multiplication using greedy extraction of boxes

D Brahme, B R Mishra, A Barve

2010 International Conference on High Performance Computing > 1 - 10

2010 International Conference on High Performance Computing (HiPC 2010)

Parallel Sparse Matrix Vector Multiplication (PSpMV) is a compute-intensive kernel used in iterative solvers like Conjugate Gradient, GMRES and Lanczos. Numerous attempts at optimizing this function have been made that require fine tuning of many hardware and software parameters to achieve optimal performance. We attempt to offer a simple framework that involves (i) Employing a greedy algorithm to...

Content availability

Available (42)
None (1)

Keywords

INSTRUCTION SETS (9)
GRAPHICS PROCESSING UNIT (8)
SERVERS (8)
COPROCESSORS (7)
ALGORITHM DESIGN AND ANALYSIS (6)
BANDWIDTH (6)
COMPUTER GRAPHIC EQUIPMENT (6)
MEMORY MANAGEMENT (6)
PARALLEL MACHINES (6)
PROTOCOLS (6)
COMPUTER ARCHITECTURE (5)
MEASUREMENT (5)
POWER DEMAND (5)
REGISTERS (5)
ARRAYS (4)
CLUSTERING ALGORITHMS (4)
COMPUTATIONAL MODELING (4)
DELAY (4)
FAULT TOLERANCE (4)
FAULT TOLERANT SYSTEMS (4)
HARDWARE (4)
MULTIPROCESSING SYSTEMS (4)
OPTIMIZATION (4)
POWER MANAGEMENT (4)
PROPOSALS (4)
RESOURCE MANAGEMENT (4)
RUNTIME (4)
SCHEDULING (4)
SYSTEM RECOVERY (4)
WIRELESS SENSOR NETWORKS (4)
BENCHMARK TESTING (3)
CACHE STORAGE (3)
CLOUD COMPUTING (3)
COMPUTER CENTRES (3)
ENERGY CONSUMPTION (3)
FAULT TOLERANT COMPUTING (3)
GPU (3)
HEURISTIC ALGORITHMS (3)
KERNEL (3)
MAINFRAMES (3)
MEMORY ARCHITECTURE (3)
MESSAGE PASSING (3)
MIDDLEWARE (3)
PARTITIONING ALGORITHMS (3)
PEER-TO-PEER COMPUTING (3)
PERFORMANCE EVALUATION (3)
POWER AWARE COMPUTING (3)
POWER CONSUMPTION (3)
PROGRAM PROCESSORS (3)
RELIABILITY (3)
SHARED MEMORY SYSTEMS (3)
STORAGE MANAGEMENT (3)
SYNCHRONIZATION (3)
TOPOLOGY (3)
APPLICATION PROGRAM INTERFACES (2)
APPROXIMATION ALGORITHMS (2)
APPROXIMATION METHODS (2)
AUTOMATED MAPPING (2)
COMPUTER GRAPHICS (2)
CONJUGATE GRADIENT SOLVER (2)
DATA CENTERS (2)
DATA LOCALITY (2)
DATA MINING (2)
DATA MODELS (2)
DISTRIBUTED MEMORY SYSTEMS (2)
ELECTRONICS PACKAGING (2)
EQUATIONS (2)
GRAPH THEORY (2)
GRAPHICS PROCESSING UNITS (2)
HEATING (2)
HIGH PERFORMANCE COMPUTING (2)
HIGH-SPEED ETHERNET (2)
INDEXES (2)
INTERNET (2)
LAYOUT (2)
LOAD MODELING (2)
MATRIX DECOMPOSITION (2)
METEOROLOGY (2)
MICROPROCESSOR CHIPS (2)
MONITORING (2)
MPI (2)
MULTIPROCESSOR INTERCONNECTION NETWORKS (2)
NETWORK SERVERS (2)
NETWORK TOPOLOGY (2)
OPTIMISATION (2)
PARALLEL ALGORITHMS (2)
PARALLEL ARCHITECTURES (2)
PARALLEL PROGRAMMING (2)
PATTERN CLUSTERING (2)
PEER TO PEER COMPUTING (2)
PEER-TO-PEER (2)
PEER-TO-PEER VIRTUAL ENVIRONMENT (2)
PROCESSOR SCHEDULING (2)
RANDOM ACCESS MEMORY (2)
RESOURCE ALLOCATION (2)
ROUTING (2)
SCALABLE CONNECTIONLESS COMMUNICATION (2)
SENSORS (2)
SUPERCOMPUTER (2)
TELECOMMUNICATION NETWORK ROUTING (2)