By Craig Anderson and Jean-Loup Baer.
ABSTRACT
Parallel applications exhibit a wide variety of memory reference patterns, so designing a memory architecture that serves all of them well is difficult. Because tolerating or reducing memory latency is a priority in effective parallel processing, it is important to explore new techniques for reducing memory traffic.
In this paper, we describe a snoopy cache coherence protocol that uses a large transfer block (to take advantage of spatial locality) and a small coherence block (to avoid false sharing). To further illustrate the protocol, we present an example of its workings. We then present the results of simulating our protocol on five applications that exhibit a variety of reference patterns. We find that our protocol effectively takes advantage of spatial locality while avoiding the increase in false sharing that often occurs when using large line sizes.
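The idea of separating the transfer block from the coherence block can be sketched roughly as follows. This is only an illustration of the general subblock concept, not the protocol defined in the report: the three-state model, the class and method names (`CacheLine`, `fill`, `snoop_remote_write`), and the sizes used are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    # Simplified, assumed state set; the report's protocol may use others.
    INVALID = "I"
    SHARED = "S"
    MODIFIED = "M"


@dataclass
class CacheLine:
    """One transfer block split into independently tracked coherence subblocks."""

    base_addr: int
    subblock_size: int  # hypothetical coherence granularity, in bytes
    num_subblocks: int  # transfer block size = subblock_size * num_subblocks
    states: list = field(default_factory=list)

    def __post_init__(self):
        if not self.states:
            self.states = [State.INVALID] * self.num_subblocks

    def index_of(self, addr: int) -> int:
        """Map a byte address to the subblock that holds it."""
        offset = addr - self.base_addr
        if not 0 <= offset < self.subblock_size * self.num_subblocks:
            raise ValueError("address outside this cache line")
        return offset // self.subblock_size

    def fill(self) -> None:
        # Assumption: a miss brings in the whole transfer block as SHARED,
        # which is what exploits spatial locality.
        self.states = [State.SHARED] * self.num_subblocks

    def local_write(self, addr: int) -> None:
        # Only the written subblock changes state.
        self.states[self.index_of(addr)] = State.MODIFIED

    def snoop_remote_write(self, addr: int) -> None:
        # A write seen on the bus invalidates one subblock, not the whole line,
        # so unrelated data in the same line stays usable (no false sharing).
        self.states[self.index_of(addr)] = State.INVALID

    def hit(self, addr: int) -> bool:
        return self.states[self.index_of(addr)] is not State.INVALID


line = CacheLine(base_addr=0, subblock_size=8, num_subblocks=4)
line.fill()
line.snoop_remote_write(8)  # another processor writes into the second subblock
```

After the remote write in this sketch, `line.hit(0)` still succeeds while `line.hit(8)` misses. With a single coherence state for the entire 32-byte line, both accesses would miss.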
@techreport{Andersonb94,
author="Craig Anderson and Jean-Loup Baer",
title="Design and Evaluation of a Subblock Cache Coherence Protocol for Bus-Based Multiprocessors",
institution="University of Washington",
year="1994",
number="94-05-02"
}