Richard Martin, Amin Vahdat, David Culler, and Thomas Anderson. Effects
of Communication Latency, Overhead, and Bandwidth in a Cluster Architecture. Proceedings
of the 24th Annual International Symposium on Computer Architecture (ISCA),
pages 85-97. June 1997.
This work provides a systematic study of the impact of communication performance
on parallel applications in a high-performance network of workstations. We
develop an experimental system in which the communication latency, overhead, and
bandwidth can be independently varied to observe the effects on a wide range of
applications. Our results indicate that current efforts to improve cluster
communication performance to that of tightly integrated parallel machines
result in significantly improved application performance. We show that
applications demonstrate strong sensitivity to overhead, slowing down by a
factor of 60 on 32 processors when overhead is increased from 3 to 103
microseconds. Applications in this study are also sensitive to per-message
bandwidth, but are surprisingly tolerant of increased latency and lower per-byte
bandwidth. Finally, most applications demonstrate a highly linear dependence on
both overhead and per-message bandwidth, indicating that further improvements in
communication performance will continue to improve application performance.
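
The linear dependence on overhead described above can be illustrated with a minimal sketch. It is not the authors' model: it assumes each message pays the added overhead once at the sender and once at the receiver (the factor of 2), and the function names, base run time, and message count are hypothetical values chosen only for illustration.

```python
def predicted_runtime_s(base_runtime_s: float,
                        messages_per_proc: int,
                        added_overhead_us: float) -> float:
    """Estimate run time after adding per-message overhead.

    Assumption of this sketch: every message incurs the added overhead
    twice (send side and receive side), and nothing overlaps with it.
    """
    added_s = 2 * messages_per_proc * added_overhead_us / 1e6
    return base_runtime_s + added_s


def slowdown(base_runtime_s: float,
             messages_per_proc: int,
             added_overhead_us: float) -> float:
    """Ratio of predicted run time to the original run time."""
    predicted = predicted_runtime_s(base_runtime_s, messages_per_proc,
                                    added_overhead_us)
    return predicted / base_runtime_s


if __name__ == "__main__":
    # Hypothetical application: 10 s base run time, one million messages
    # per processor. Overhead rises from 3 to 103 microseconds, i.e. by 100.
    for extra_us in (0, 25, 50, 100):
        print(extra_us, slowdown(10.0, 1_000_000, extra_us))
```

Under these assumptions the slowdown grows as a straight line in the added overhead, with a slope set by how many messages each processor handles, so communication-intensive applications are the most sensitive.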