Static Analysis of Barrier Synchronization in Explicitly Parallel Programs (186KB)

By Tor Jeremiassen and Susan J. Eggers

ABSTRACT

Many coarse-grained, explicitly parallel programs execute in phases delimited by barriers to preserve sets of cross process data dependencies. One of the major obstacles to optimizing these programs is the necessity to conservatively assume that any two statements in the program may execute concurrently. Consequently, compilers fail to take advantage of opportunities to apply optimizing transformations, particularly those designed to improve data locality, both within and across the phases of the program.

We present a simple and efficient compile time algorithm that uses the presence of barriers to perform non-concurrency analysis on coarse-grain, explicitly parallel programs. It works by dividing the program into a set of phases and computing the control flow between them. Each phase consists of one or more sequences of program statements that are delimited by barrier synchronization events and can execute concurrently. We show that the algorithm performs perfectly on all but one of our benchmarks.

The barrier analysis algorithm has multiple uses. We have implemented it as part of a set of false sharing detection and transformation algorithms in a source-to-source restructurer that reduces cache coherency overhead in bus-based, shared memory multiprocessors. Including the barrier synchronization analysis algorithm produced significant improvements in cache performance for a benchmark that showed only small to moderate improvements with the previously available static analysis.

@inproceedings{JeEg94:PACT,
    author="T.E. Jeremiassen and S.J. Eggers",
    title={Static Analysis of Barrier Synchronization in Explicitly Parallel
           Programs},
    booktitle={International Conference on Parallel Architectures and
    Compilation Techniques},
    month = "August",
    year="1994"
}

pardo@cs.washington.edu