R. Y. Wang, Michael D. Dahlin, Thomas E. Anderson. Experience with a Distributed File System Implementation. University of California Technical Report CSD-98-986. January 1998.
This paper highlights some of the lessons learned during the course of
implementing xFS, a fully distributed file system. xFS is an interesting case
study for two reasons. First, xFS's serverless architecture leads to more
complex distributed programming issues than are faced by traditional
client-server operating system services. Second, xFS implements a complex,
multithreaded service that is tightly coupled with the underlying operating
system. This combination turned out to be quite challenging. On one hand, the
complexity of the system forced us to turn to distributed programming tools
based on formal methods to verify the correctness of our distributed algorithms;
on the other hand, the complex interactions with the operating system on
individual nodes violated some of the tools' assumptions, making it difficult to
use them in this environment. Furthermore, the xFS system tested the limits of
abstractions such as threads, RPC, and vnodes that have traditionally been used
in building distributed file systems. Based on our experience, we suggest
several strategies that should be followed by those wishing to build distributed
operating system services, and we also indicate several areas where programming
tools and operating system abstractions might be improved.