Reducing False Sharing on Shared Memory Multiprocessors through Compile Time Data Transformations

By Tor Jeremiassen and Susan J. Eggers

ABSTRACT

We have developed compiler algorithms that analyze coarse-grained, explicitly parallel programs and restructure their shared data to minimize the number of false sharing misses. The algorithms analyze each process's accesses to shared data, use this information to pinpoint the data structures that are prone to false sharing, and choose an appropriate transformation to reduce it.
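The abstract does not name the individual transformations. As an illustration only, one widely used data layout change of this kind is padding per-process data so that each process's element occupies its own cache block. The sketch below is not the authors' algorithm. The 64-byte block size, the process count, and all identifiers are assumptions made for the example, and it further assumes that each array starts on a block boundary.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64  /* assumed cache block size in bytes */
#define NPROC 4        /* assumed number of processes */

/* Before: per-process counters packed back to back, so several
   processes write to different words of the same cache block. */
struct packed_counters {
    long count[NPROC];
};

/* After: each counter is padded out to a full cache block. */
struct padded_counter {
    long count;
    char pad[CACHE_LINE - sizeof(long)];
};

/* Compile-time check that one padded element fills exactly one block. */
static_assert(sizeof(struct padded_counter) == CACHE_LINE,
              "padded element must fill exactly one cache block");

/* Returns 1 if elements i and j of an array with the given element size
   start in the same cache block, assuming the array itself starts on a
   block boundary; returns 0 otherwise. */
int shares_block(size_t elem_size, size_t i, size_t j)
{
    return (i * elem_size) / CACHE_LINE == (j * elem_size) / CACHE_LINE;
}

/* Number of cache blocks spanned by the starting addresses of n
   consecutive elements (n >= 1), under the same alignment assumption. */
size_t blocks_touched(size_t elem_size, size_t n)
{
    return ((n - 1) * elem_size) / CACHE_LINE + 1;
}
```

With the packed layout, shares_block(sizeof(long), 0, 1) reports that processes 0 and 1 write into the same block, which is the condition for false sharing. With the padded layout the same query reports separate blocks, at the cost of NPROC full blocks of memory.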

The algorithms eliminated an average of 64% of false sharing misses across the entire workload, and more than 90% in two programs. How well this reduction translated into improved execution time, however, depended heavily on the memory subsystem architecture and on previous programmer efforts to optimize for locality. On a multiprocessor with a large cache configuration and a high cache miss penalty, the transformations improved the execution time of programmer-unoptimized applications by as much as 60%. On programs where previous programmer efforts to improve data locality had already reduced the original amount of false sharing, and on a multiprocessor with a small cache configuration and a low cache miss penalty, the gains were more modest.

@techreport{JeEg94:FalseSharing,
    author="T.E. Jeremiassen and S.J. Eggers",
    title={Reducing False Sharing on Shared Memory Multiprocessors through
           Compile Time Data Transformations},
    institution="University of Washington",
    number="94-09-05",
    year="1994"
}
