University of Washington
Computer Science & Engineering

Abstracts of Research in Networking, Operating Systems, and Distributed Systems


WebOS: Operating System Services for Wide Area Applications
(T. Anderson)
WebOS provides OS services to wide-area applications, including mechanisms for resource discovery, a global namespace, remote process execution, resource management, authentication, and security. On a single machine, application developers can rely on the local operating system to provide these abstractions. In the wide area, however, application developers are forced to build these abstractions themselves or to do without. This ad-hoc approach wastes programmer effort and system resources. To address these problems, WebOS provides basic operating systems services needed to build applications that are geographically distributed, highly available, incrementally scalable, and dynamically reconfiguring. An application that demonstrates the utility of WebOS is Rent-A-Server, a web server capable of dynamically replicating itself geographically in response to client access patterns.

CRISIS: A Wide Area Security Architecture
(T. Anderson)
CRISIS re-examines wide area authentication and access control from the perspective of programmable control of remote resources. CRISIS explores the systematic application of a number of design principles to the problem of wide area security: redundancy to eliminate single points of attack, caching to improve performance and availability over slow and unreliable wide area networks, fine-grained capabilities and roles to enable lightweight control of privilege, and complete local logging of all evidence used to make each access control decision.

Self-Tuning Network File Systems
(T. Anderson)
File system designs have long been driven by changes in the cost and performance of the underlying hardware. A designer must consider the relative cost per byte of memory versus disk, the relative performance of the CPU versus a network access versus a disk access, the relative magnitudes of seek time, rotational delay, and disk bandwidth, not to mention changes in the workload placed on the file system. These changes pose a tremendous challenge to file system designers. Although it may be possible to design a system that is efficient for today's hardware and workload patterns, file system implementations are often used for decades, long after their design assumptions are no longer valid. In this work, we are exploring a new design principle for file systems, called self-tuning. A self-tuning system (1) measures the physical characteristics of the underlying hardware, (2) measures the workload placed on the file system, and (3) adapts the file system behavior to match. Building a self-tuning system requires new algorithms that monitor the environment and adjust behavior appropriately. The alternative to self-tuning is building a file system for a fixed point on the moving target of hardware and workload characteristics, and either living with the resulting system well past its applicability or rebuilding the system from scratch every few years.

Memory System Analysis
(Baer, Bershad, Karlin, H. Levy)
Our research group is investigating techniques that use the operating system to improve memory system performance. We rely on a combination of simple hardware support and operating system modifications to monitor the dynamic behavior of applications. These monitoring mechanisms incur a small overhead at runtime, but the information they collect can be used to identify sources of memory system delays such as cache misses and TLB misses. By identifying and resolving these bottlenecks, we not only pay for the overhead of the monitoring mechanisms, but also significantly improve overall system performance. Click here for more information.

SPIN -- An Extensible Microkernel for Application-Specific Operating System Services
(
Bershad, Chambers, Eggers)
Application domains, such as multimedia, databases and parallel computing, require operating system services with high performance and high functionality. Existing operating systems provide fixed interfaces and implementations to system services and resources. This makes them inappropriate for applications whose resource demands and usage patterns are poorly matched by the services provided. The SPIN operating system enables system services to be defined in an application-specific fashion through an extensible microkernel. It offers fine-grained control over a machine's logical and physical resources to applications through run-time adaptation of the system to application requirements. The application-specific services are installed into the kernel at run-time. A trusted compiler and pointer-safe language run-time environment ensure that the installed sequences do not violate system integrity. The system runs on DEC ALPHA workstations and Intel PCs. Click here for more information.

Instrumentation and Optimization of WIN32/Intel Executables with Etch
(Bershad, H. Levy)
We have developed an application program performance evaluation and optimization system called Etch for Intel x86 platforms running the Windows/NT operating system. The system allows you to annotate existing binaries with arbitrary instructions (for example, to trace, or perform coverage analysis), or to rewrite an existing binary so that it executes more efficiently. Etch works directly on Win32 x86 executables. It does not require program source code for either measurement or optimization. Click here for more details.

Large Scale Mail Services
(Bershad, H. Levy, T. Anderson)
To better understand the types and structures of services appropriate for large scale clusters, we are developing Porcupine, a highly available, highly scalable, low cost cluster-based mail server. The goal of Porcupine is to provide mail service for 100 million users, or roughly one billion mail messages per day. at a cost of $3/year/user. Porcupine is being developed on a PC-based cluster.

Kimera -- A Factored Java Architecture
(Bershad)
We are implementing a new Java architecture based on factored components for security, performance, and scalability. By locating crucial Java Virtual Machine services at trust-domain boundaries, such as Intranet firewalls, we can make safety enforcement mandatory, ease security management and reduce the processing requirements of Java endpoints. Under such a centralized security architecture, the trusted computing base is minimal and consists of small and simple components whose security can be more readily assured.

SPINE -- A Safe Programmable Integrated Network Environment
(Bershad)
SPINE provides a safe execution environment for applications to directly compute on the network interface (NI). Several network interfaces, for example the Myrinet, as well as devices from the new I2O architecture, provide the infrastructure necessary to compute on the NI itself. SPINE uses ideas from the SPIN extensible operating system (safe language, extensible interfaces, and resource management) and applies these to the NI, which allows applications to customize their interaction with the network. The motivation is to migrate application-specific functionality directly to the NI, thereby reducing I/O related data and control transfers to the host system.

Security in Extensible Systems
(Bershad)
Extensible systems, such as SPIN or Java, raise new security concerns. In these systems, units of code, or extensions, can be added to a running system in almost arbitrary fashion. Extensions closely interact through low latency, but type-safe interfaces to form a tightly integrated system. As extensions can come from arbitrary sources, not all of whom can be trusted to conform to an organization's security policy, such structuring raises the question of how security constraints are enforced in an extensible system. In the SPIN extensible operating system built at the University of Washington, we are experimenting with an access control mechanism to address this problem. Our access control mechanism decomposes access control into a policy-neutral enforcement manager and a security policy manager, and it is transparent to extensions in the absence of security violations. It structures the system into protection domains, enforces the protection domains through access control checks, and performs auditing of system operations. It works by inspecting extensions for their types and operations to determine which abstraction require protection, and by redirecting procedure or method invocations to inject access control operations into the system. By only imposing those dynamic access control operations that are absolutely necessary, we intend to minimize the performance overhead of system security. Click here for more details.

SAAM II Development
(Golde)
SAAM II is a sophisticated computer program for the analysis of certain biomedical data in the context of a proposed model. Its development is an interdisciplinary effort, involving faculty, students, and staff from Computer Science and Engineering, Bioengineering, and the Applied Physics Laboratory. The program encompasses both simulation and fitting of parameters to experimental data. The equations specifying the model can be algebraic or first order ordinary linear or nonlinear differential equations. One of the key elements of the program is that the differential equations are automatically derived from a proposed graphical model of a physical system. The program also performs a statistical analysis of the results.

Design and Analysis of Competitive Algorithms for Online Problems
(Karlin)
Competitive algorithms for online problems are guaranteed to have performance which is close to that of the optimal offline performance on each input. We are in the process of designing and implementing such algorithms for a number of basic systems problems, including cache management algorithms for exploiting spatial locality and bypassing (with Baer), cooperative caching and prefetching in PC clusters and wider area networks (with Levy), and other resource allocation problems. Our focus is on the empirical evaluation of algorithms with provable performance guarantees, with an eye towards algorithms that adapt to observed behavior patterns. In addition, we are looking at a number of more theoretical research questions, revolving around the long-term goal of coming up with general principles that characterize competitive algorithms for online problems.

Vector Quantization and Wavelet Image Compression
(Ladner, Riskin (Electrical Engineering))
The use of digital images is rapidly expanding due to the emerging technologies in numerous fields, such as remote sensing and HDTV. The aim of this project is to provide state of the art, commercially available, tools for storing, transmitting, and analyzing image data. The goal of this project is to develop new algorithms based on VQ and wavelets for different image processing tasks. These are document image processing, document image browsing, and compression for fast transmission of ice information location images.

Opal: A Single-Address-Space Operating System
(H. Levy)
The Opal project is exploring a new operating system structure, tuned to the needs of complex CAD-style applications, where many cooperating programs share a large persistent object database. In Opal, all code and data exist within a single, huge, shared address space. Protection is independent of the single address space; each Opal thread executes within a protection domain that defines which virtual pages it has the right to access. The single address space enhances sharing and cooperation, because addresses have a unique (for all time) interpretation. Thus, pointer-based data structures can be directly communicated and shared between programs at any time, and can be stored directly on secondary storage without the need for translation. This structure is simplified by the availability of a large address space, such as those provided by the DEC Alpha, MIPS R4000, HP/PA-RISC, IBM RS6000, and Sun UltraSPARC. We are currently building an Opal prototype, in conjunction with The Boeing Company, which is testing Opal's structure as a basis for their next-generation airplane CAD system. Click here for more information.

Global Memory Management in Workstation and PC Clusters
(H. Levy, Karlin)
Our work involves the general issue of managing resources globally in high-speed local-area networks. The first phase has been the design and implementation of algorithms that manage memory globally in a cluster. In our system, called GMS (for global memory system), active nodes can use available physical memory pages of "idle" nodes as backing store, reducing the fetch time on those pages by an order of magnitude. We have implemented our system at the lowest level of on the OSF operating system, underneath of both the file and virtual memory systems. GMS currently runs on an ATM-connected network of DEC Alpha workstations. Click here for more information.

Real-Time Systems
(Shaw)
Our research has been concerned with models, techniques, and tools for specifying, analyzing, and generating software for real-time applications. The emphasis is on requirements and design specifications; fast prototyping; real-time operating systems; real-time programming; and methods for describing, predicting, and verifying the timing behavior of programming languages and systems. Current research includes: using execution-time bounds and assertional logic to state and prove timing properties of higher-level language programs; building software tools that predict the deterministic execution times of programs; refining, implementing, and testing a specification method, called communicating real-time state machines; and studying the use of time-stamped event histories in real-time programming. Two themes of the research are the incorporation of time as a first-class specification and programming object, and the analysis and exploitation of concurrency.

DDDDrraw - Distributed 3D Real-time Rendering at Washington
(Zahorjan)
Real-time 3D rendering is an essential component of an increasing number of multimedia applications (e.g., VRML, virtual reality, and geographical information systems). However, despite recent performance gains, the capability of a single workstation to render in real-time is still quite limited. In the DDDDrraw project (pronounced draw), we are working to improve real-time rendering performance through the use of distributed rendering on a cluster of workstations. In this context, improving performance means extending the complexity of the scenes that can be rendered at a sufficiently fast as well as predictable frame rate.