Languages and Compilers for Parallel Computing

chapter

Fine-grain scheduling under resource constraints

Paul Feautrier

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 1-15

Many present-day microprocessors have fine-grain parallelism, be it in the form of a pipeline, multiple functional units, or replicated processors. The efficient use of such architectures depends on the capability of the compiler to schedule the execution of the object code in such a way that most of the available hardware is in use while complying with data dependences. In the case of one simple...

chapter

Mutation scheduling: A unified approach to compiling for fine-grain parallelism

Steven Novack, Alexandru Nicolau

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 16-30

Trade-offs between code selection, register allocation, and instruction scheduling are inherently interdependent, especially when compiling for fine-grain parallel architectures. However, the conventional approach to compiling for such machines arbitrarily separates these phases so that decisions made during any one phase place unnecessary constraints on the remaining phases. Mutation Scheduling attempts...

chapter

Compiler techniques for fine-grain execution on workstation clusters using PAPERS

H. G. Dietz, W. E. Cohen, T. Muhammad, T. I. Mattox

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 31-45

Just a few years ago, parallel computers were tightly-coupled SIMD, VLIW, or MIMD machines. Now, they are clusters of workstations connected by communication networks yielding ever-higher bandwidth (e.g., Ethernet, FDDI, HiPPI, ATM). For these clusters, compiler research is centered on techniques for hiding huge synchronization and communication latencies, etc. — in general, trying to make parallel...

chapter

Solving alignment using elementary linear algebra

David Bau, Induprakas Kodukula, Vladimir Kotlyar, Keshav Pingali, et al.

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 46-60

Data and computation alignment is an important part of compiling sequential programs to architectures with non-uniform memory access times. In this paper, we show that elementary matrix methods can be used to determine communication-free alignment of code and data. We also solve the problem of replicating read-only data to eliminate communication. Our matrix-based approach leads to algorithms which...

chapter

Detecting and using affinity in an automatic data distribution tool

Eduard Ayguadé, Jordi Garcia, Mercè Gironés, Jesús Labarta, et al.

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 61-75

This paper describes some aspects of the implementation of our Data Distribution Tool (DDT), which accepts programs written in Fortran 77 and obtains HPF alignment and distribution directives for the arrays used in the program. In particular, we describe the phases of the tool which analyze reference patterns in loops, record preferences for alignment and obtain the alignment functions. These functions...

chapter

Array distribution in data-parallel programs

Siddhartha Chatterjee, John R. Gilbert, Robert Schreiber, Thomas J. Sheffler

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 76-91

We consider distribution at compile time of the array data in a distributed-memory implementation of a data-parallel program written in a language like Fortran 90. We allow dynamic redistribution of data and define a heuristic algorithmic framework that chooses distribution parameters to minimize an estimate of program completion time. We represent the program as an alignment-distribution graph. We...

chapter

Communication-free parallelization via affine transformations

Amy W. Lim, Monica S. Lam

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 92-106

The paper describes a parallelization algorithm for programs consisting of arbitrary nestings of loops and sequences of loops. The code produced by our algorithm yields all the degrees of communication-free parallelism that can be obtained via loop fission, fusion, interchange, reversal, skewing, scaling, reindexing and statement reordering. The algorithm first assigns the iterations of instructions...

chapter

Finding legal reordering transformations using mappings

Wayne Kelly, William Pugh

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 107-124

We present a unified framework for applying iteration reordering transformations. This framework is able to represent traditional transformations such as loop interchange, loop skewing and loop distribution as well as compositions of these transformations. Using a unified framework rather than a sequence of ad hoc transformations makes it easier to analyze and predict the effects of these transformations...

chapter

A new algorithm for global optimization for parallelism and locality

Bill Appelbe, Srinivas Doddapaneni, Charles Hardnett

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 125-140

Converting sequential programs to execute on parallel computers is difficult because of the need to globally optimize for both parallelism and data locality. The choice of which loop nests to parallelize, and how, drastically affects data locality. Similarly, data distribution directives, such as DISTRIBUTE in High Performance Fortran (HPF), affect available parallelism and locality. What is needed...

chapter

Polaris: Improving the effectiveness of parallelizing compilers

William Blume, Rudolf Eigenmann, Keith Faigin, John Grout, et al.

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 141-154

It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small kernels, they often fail to extract any meaningful parallelism from large applications. After a study of application codes, it was concluded that by adding a few new techniques to current compilers, automatic parallelization...

chapter

A formal approach to the compilation of data-parallel languages

J. A. Trescher, L. C. Breebaart, P. F. G. Dechering, A. B. Poelman, et al.

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 155-169

In this paper we describe an approach to the compilation of data-parallel programming languages based on a formally defined intermediate language, called V-cal. The calculus V-cal was designed to represent the semantics of data management and control primitives found in data-parallel languages and allows program transformations and optimizations to be described as semantics-preserving rewrite rules...

chapter

The data partitioning graph: Extending data and control dependencies for data partitioning

Tsuneo Nakanishi, Kazuki Joe, Hideki Saito, Constantine D. Polychronopoulos, et al.

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 170-185

Scalability and cost considerations suggest that distributed and distributed shared memory parallel computers will dominate future parallel architectures. These machines cannot be used effectively unless efficient automatic and static solutions to the data partitioning and placement problem become available. Significant progress toward this end has been made in the last few years, but we are still...

chapter

Detecting value-based scalar dependence

Eric Stoltz, Michael Wolfe

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 186-200

Precise value-based data dependence analysis for scalars is useful for advanced compiler optimizations. The new method presented here for flow and output dependence uses Factored Use and Def chains (FUD chains), our interpretation and extension of Static Single Assignment. It is precise with respect to conditional control flow and dependence vectors. Our method detects dependences which are independent...

chapter

Minimal data dependence abstractions for loop transformations

Yi-Qing Yang, Corinne Ancourt, François Irigoin

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 201-216

Many abstractions of program dependences have already been proposed, such as the Dependence Distance, the Dependence Direction Vector, the Dependence Level or the Dependence Cone. These abstractions differ in precision. The minimal abstraction associated with a transformation is the abstraction that contains the minimal amount of information necessary to decide when such a transformation...

chapter

Differences in algorithmic parallelism in control flow and call multigraphs

Vincent Sgro, Barbara G. Ryder

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 217-233

Our parallel hybrid analysis methods facilitate the parallelization of the analysis phase of a software transformation system, by enabling deeper semantic analyses to be accomplished more efficiently than if performed sequentially. Our previous empirical studies profiled these hybrid techniques on the Reaching Definitions problem [LMR91, LR92a, LR92b]. Recently, we have applied our method to the Interprocedural...

chapter

Flow-insensitive interprocedural alias analysis in the presence of pointers

Michael Burke, Paul Carini, Jong-Deok Choi, Michael Hind

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 234-250

Data-flow analysis algorithms can be classified into two categories: flow-sensitive and flow-insensitive. To improve efficiency, flow-insensitive interprocedural analyses do not make use of the intraprocedural control flow information associated with individual procedures. Since pointer-induced aliases can change within a procedure, applying known flow-insensitive analyses can result in either incorrect...

chapter

Incremental generation of index sets for array statement execution on distributed-memory machines

S. D. Kaushik, C. -H. Huang, P. Sadayappan

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 251-265

In compiling array statements for distributed-memory machines, efficient generation of local index sets and communication sets is important. Several techniques for enumerating these sets for block-cyclically distributed arrays have been presented in the literature. When sufficient compile-time information is not available, generation of the structures which facilitate efficient enumeration of these...

chapter

A unified data-flow framework for optimizing communication

Manish Gupta, Edith Schonberg, Harini Srinivasan

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 266-282

This paper presents a framework, based on global array data flow analysis, to reduce communication costs in a program being compiled for a distributed memory machine. This framework applies techniques for partial redundancy elimination to available section descriptors, a novel representation of communication involving array sections. With a single framework, we are able to capture numerous optimizations...

chapter

Interprocedural communication optimizations for distributed memory compilation

Gagan Agrawal, Joel Saltz

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 283-299

Managing communication is a difficult problem in distributed memory compilation. When the exact data to be communicated cannot be determined at compile time, communication optimizations can be performed by runtime routines which generate schedules for communication. This leads to two optimization problems: placing communication so that data once communicated can be reused if possible and placing schedule calls...

chapter

Analysis of event synchronization in parallel programs

J. Ramanujam, A. Mathew

Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, pp. 300-315

The increase in the number and complexity of parallel programs has led to a need for better approaches for synchronization error detection and debugging of parallel programs. This paper presents an efficient and precise algorithm for the detection of nondeterminacy (race conditions) in parallel programs. Nondeterminacy exists in a program when the program yields different outputs for different runs...


Languages and Compilers for Parallel Computing
7th International Workshop Ithaca, NY, USA, August 8–10, 1994 Proceedings

