System Support for Scalable Network Servers

Vivek S. Pai, Peter Druschel, Willy Zwaenepoel

Department of Computer Science
Rice University

We are building self-scaling Internet servers with commodity hardware and modular software components, using a novel system architecture designed to support scalable and flexible network servers. Three novel ideas make our proposed network server architecture unique, practical and efficient:

  1. A (single-node) unified buffering and caching system, called IO-Lite, that avoids copying and double buffering of I/O data, thus increasing throughput and reducing memory requirements. Moreover, IO-Lite allows high-performance communication between different server software modules and encourages modular composition.
  2. A transparent multi-node extension of the IO-Lite unified buffering and caching system, eliminating the need for the server to explicitly manage what data resides on what server node. As a result, disk data need not be fully replicated at each node, and the main memories of all the server nodes form a globally unified disk cache of a large effective size.
  3. Adaptive, locality-aware load distribution spreads the workload over the set of server nodes, while at the same time preserving locality in the request sequence to improve overall performance. Based on runtime feedback, the degree of data replication in both primary memory (node caches) and secondary storage (node disk systems) is automatically adjusted to match the offered workload.

Our architecture enables modular server software components. This implies, for instance, that programs generating dynamic content run in isolation from the server and from each other. Besides the obvious advantages of reliability and maintainability, a modular software architecture allows us to adapt quickly to the frequent changes in the Internet environment, driven by the need for new features or improved performance. Current or proposed network server architectures often abandon modularity, because poor performance of the interprocess communication system leads to unacceptable throughput. Safe languages, such as Java or Perl, provide only a partial solution to this problem, because content providers often want to use existing applications, almost invariably written in unsafe languages.

High-volume Internet servers are based either on costly, dedicated supercomputers (e.g., the official webserver for the Atlanta Olympic Games), or they use clusters of independent workstation/PC servers. Unlike supercomputers, cluster servers are able to ride the unique price/performance curve driven by the high-volume workstation/PC market. With cluster servers, incoming requests are distributed to server nodes without regard for a request's target content. This naive distribution of requests and the lack of integration among the nodes of a cluster limit performance and scalability. Typically, the entire server database must be mirrored at each node, and the effective size of the server cache cannot exceed the size of each node's main memory. Our proposed architecture addresses these problems.

Current cluster servers require substantial performance-related configuration and administration. Distributed IO-Lite gives each server node transparent access to the database, irrespective of the physical location of the data and the degree of replication. An adaptive load control system exploits this flexibility by dynamically adjusting the degree of data replication in both primary and secondary storage, based on runtime feedback. As a result, performance-related tuning is fully automated. System capacity can be scaled simply by adding hardware.