University of Washington

Department of Computer Science & Engineering



1998/99 Research Abstracts


Abstracts of Research in Embedded Systems and Reconfigurable Computing


Embedded System Synthesis
(Borriello)
Automated synthesis of the hardware and software elements of embedded controllers is critical to rapidly evaluating the many design choices faced by designers of task-specific computing and communication devices. We are currently developing a synthesis system, as part of the Chinook project, that will automatically generate many of the low-level software needed to coordinate the elements of an embedded system including detailed scheduling to meet real-time constraints. Thus, our principal interest is to make embedded software more portable by automatically generating the most error-prone and system-specific portions of the software. This includes the automatic customization of generic device drivers and application-specific, real-time kernels to execute the software on multiple processors.

Component-Based Design
(Borriello)
Modern software design involves the reuse of a variety of existing components. The elements encapsulate functionality associated with a particular task such as: a web browser, management of a wireless connection, user interface widgets, etc.. Currently, these components are provided to designers via an API abstraction for a particular platform or system architecture. This style supports data-composition well (i.e., the transfer of data packets in a data-flow style) but does not do as well for control-composition (where the internal state changes of one component must also effect others). We are developing a specification methodology that supports both data and control composition at a high-level of abstraction and supports the automatic synthesis of all the coordinating code (that handles communication, synchronization, and consistency between the components). Our goal is to be make software components more easily reusable and retargetable and optimize them to different system architectures (e.g., use the same specification to generate a family of applications including those that may run solely on a workstation, as part of an embedded system, or distributed over a network). We are currently developing an integrated development environment that supports this style of specification and software design.

Information Appliances and Wearable Computing
(Borriello)
The future of computing, for the majority of people, is not going to be on the desktop. Rather, it will be in devices unobtrusively embedded in their homes, cars, and carried on their person. There are interesting issues in constructing such devices that related to their connectivity, user input/output modalities, and power sources. We are actively working on speech recognition and synthesis (as may be used in a cellular phone that connects to information services on the web) and on uses of small personal digital assistants in home and automobile environments (for example, to remote control home appliances and utilize GPS information for coordinating routes and modes of transportation). Our aim is to prototype interesting new devices and use them as drivers for our work in component-based design and embedded system synthesis.

Silicon Neuroscience
(Diorio [home page, research page])
Nervous tissue (e.g. the brain) computes using hardware and algorithms that are fundamentally different from our digital computers. Our research group believes that we can learn new principles of computing by studying neurobiology. Our goal is to investigate the foundations of neuronal computation, and to build silicon integrated circuits that compute using these principles. We call our approach silicon neuroscience: the development of neurally inspired silicon-learning systems. We have already developed, in a standard CMOS process, a family of single-transistor devices that we call synapse transistors. Like neural synapses, synapse transistors learn locally. And they do so at biologically realistic timescales. Although we do not believe that a single transistor can model the complex behavior of a neural synapse completely, our synapse transistors do learn from their inputs. Using them, we intend to develop silicon systems that learn from their environment.

Pipelined-Systolic RADAR Signal Processing
(Diorio, Sahr (Electrical Engineering))
Researchers in the Radar Remote Sensing Laboratory (rcs.ee.washington.edu/SPP/) are building a passive, bistatic radar for real-time measurements of ionospheric fluctuations. This RADAR requires a processing throughput of 10-100GOPS for a two-receiver system; the computational burden increases as the square of the number of receivers (additional receivers enable angle-of-arrival estimation and target imaging). We are optimizing the signal-processing algorithms for integrated circuit implementation, and we are developing custom, pipelined-systolic CMOS ICs with the requisite throughput to enable real-time ionospheric measurement.

Action-Potential Based Computing
(Diorio, Atlas (Electrical Engineering))
Conventional digital-logic systems communicate using discrete, binary voltages, transmitted on passive metal wire, and are synchronized by a system clock. By contrast, neurons, the brain's primary computing elements, communicate using millisecond-long impulse-like voltage waveforms called action potentials, transmitted on nondispersive active wire (called axon), and are not clocked. Recent neurophysiological data lends credence to computing models that use action-potential timing, and tuned wire delays, to enable self-timed signal processing. We are investigating the relationship between delay-based computing and time-frequency analysis, and are beginning to develop hardware models for computing using delta-function impulses and wire delays.

A Configurable ASIC Compiler for Compute-Intensive Applications
(Ebeling)
ASICs can provide a large price-performance advantage over DSPs for many compute-intensive applications. But ASIC technology is not appropriate for applications that present a moving target in terms of evolving standards, multiple uses, or algorithmic improvements. For such applications, the implementation must be reprogrammable to avoid premature obsolescence. We are working on a hardware compiler that generates configurable ASICs from high-level descriptions of compute-intensive tasks. A configurable ASIC uses a combination of programmed control and static reconfiguration of hardware resources to achieve flexibility without compromising performance. Depending on the intended application, the configurable ASIC can be built with more or less flexibility depending on the required range of in situ reprogrammability.

Field-Programmable Gate Arrays
(Ebeling)
Reconfigurable hardware is rapidly increasing in popularity and complexity. Our research in FGPAs falls in three different categories. First, we have developed and patented a new FPGA architecture, called Triptych, that achieves 2 to 4 times higher logic density than existing commercial FPGAs. Second, we have developed a set of architecture-independent, performance-driven mapping tools for exploring new FPGA architectures. These tools have been retargeted to a new IBM FPGA architecture, where they have achieved better performance and density than industry tools. Third, we are now investigating a new coarse-grained FPGA architecture intended for computation-intensive applications, such as digital signal processing. We expect to achieve an order of magnitude improvement in density over current FPGA architectures.

Architectural Retiming
(Ebeling)
Pipelining is probably the most common technique for improving system performance, and retiming can be used to automatically optimize the performance obtained by pipelining. However, in many situations pipelining and retiming are not possible, either because of latency constraints or because a feedback cycle makes pipelining impossible. We are investigating a set of techniques that allow pipelining in these cases by changing the architecture of the circuit. Whereas retiming preserves the function of the circuit exactly on a cycle by cycle basis, changing only the position of registers in the circuit, architectural retiming relies on logic transformations, preserving functionality only in terms of input/output sequences. We are developing a CAD tool called ART which applies these transformations, relying initially on the designer to evaluate and choose those that achieve the best performance.

Chaos Routing
(Snyder, Ebeling, Bolding)
The Chaos router is a randomizing, non-minimal adaptive packet router that has been shown to have higher throughput and lower latency than the state of the art oblivious routers for both two dimensional (e.g., mesh and torus) networks and hypercube networks on uniform and non-uniform traffic patterns. Randomization is crucial, since it allows the Chaos router to be naturally livelock free without any special circuitry (i.e., packets do not circulate forever). The Chaos router has been implemented in 1.2u CMOS with a speed of 66MHz, the speed limit for this technology, using a pipelined design. Research has included processor/network interface designs, characterizations of Chaos routing under various workloads, studies of fault tolerance issues in adaptive routing, theoretical characterizations of routing algorithms, and a general study of high performance routing algorithms. Currently, the 0.6u Chaos chip is back from fab, and being incorporated into a Chaos LAN design. Click here for more information.


Abstracts of Research in Computer Architecture


Memory Hierarchy Studies
(Baer)
Processor speed has steadily improved during the last decade but memory latency and bandwidth have been progressing at a slower pace. Several memory organizations, as well as techniques to reduce or tolerate high memory latencies are investigated for single as well as multiprocessor systems. Particular projects include: (1) monitoring the execution characteristics of "real-world" (i.e., Wintel) applications for their potential optimizations in (maybe) non-conventional memory systems (2) code and data reorganization for improving instruction and data cache, TLB and Branch target buffer efficiencies. (3) design and evaluation of user-primitives for applications running on clustered shared-memory multiprocessors,

Simultaneous Multithreading
(Eggers, H. Levy)
Simultaneous multithreading (SMT) is a processor design that permits several independent threads to issue instructions to a (wide) superscalar's functional units in a single cycle. Its microarchitecture is a fairly straightforward extension of today's wide-issue, dynamically scheduling processors, which should ease its potential transition in the commercial marketplace. Our experiments to date have shown that, relative to several other architectures that are also competing to be the design paradigm in the next few years, SMT has higher program speedups and useful instruction throughput on a variety of multi-threaded workloads, while barely impacting single-thread performance. Our early studies exposed the performance advantages of simultaneously issuing instructions from several threads and dynamically sharing almost all hardware resources among those threads. Much of this research prompted advances in architectural techniques (examples are effective multithread instruction fetch hardware and compiler- and operating systems-directed support for early remapping of renaming registers) and using existing software techniques in a different manner (examples include applying different policies for OS page mapping and several compiler optimizations). Our future research will investigate using SMTs in a parallel processing environment, including the development of SMT-specific synchronization mechanisms, revisiting operating policies that drive SMT's thread-shared hardware resources and expanding our experimental workload to include real, threaded applications. Our first foray into the latter area was an investigation of SMT performance on commercial workloads.

Chaos Routing
(Snyder, Ebeling, Bolding)
The Chaos router is a randomizing, non-minimal adaptive packet router that has been shown to have higher throughput and lower latency than the state of the art oblivious routers for both two dimensional (e.g., mesh and torus) networks and hypercube networks on uniform and non-uniform traffic patterns. Randomization is crucial, since it allows the Chaos router to be naturally livelock free without any special circuitry (i.e., packets do not circulate forever). The Chaos router has been implemented in 1.2u CMOS with a speed of 66MHz, the speed limit for this technology, using a pipelined design. Research has included processor/network interface designs, characterizations of Chaos routing under various workloads, studies of fault tolerance issues in adaptive routing, theoretical characterizations of routing algorithms, and a general study of high performance routing algorithms. Currently, the 0.6u Chaos chip is back from fab, and being incorporated into a Chaos LAN design.

A Configurable ASIC Compiler for Compute-Intensive Applications
(Ebeling)
ASICs can provide a large price-performance advantage over DSPs for many compute-intensive applications. But ASIC technology is not appropriate for applications that present a moving target in terms of evolving standards, multiple uses, or algorithmic improvements. For such applications, the implementation must be reprogrammable to avoid premature obsolescence. We are working on a hardware compiler that generates configurable ASICs from high-level descriptions of compute-intensive tasks. A configurable ASIC uses a combination of programmed control and static reconfiguration of hardware resources to achieve flexibility without compromising performance. Depending on the intended application, the configurable ASIC can be built with more or less flexibility depending on the required range of in situ reprogrammability.


Abstracts of Research in Networking, Operating Systems, and Distributed Systems


WebOS: Operating System Services for Wide Area Applications
(T. Anderson)
WebOS provides OS services to wide-area applications, including mechanisms for resource discovery, a global namespace, remote process execution, resource management, authentication, and security. On a single machine, application developers can rely on the local operating system to provide these abstractions. In the wide area, however, application developers are forced to build these abstractions themselves or to do without. This ad-hoc approach wastes programmer effort and system resources. To address these problems, WebOS provides basic operating systems services needed to build applications that are geographically distributed, highly available, incrementally scalable, and dynamically reconfiguring. An application that demonstrates the utility of WebOS is Rent-A-Server, a web server capable of dynamically replicating itself geographically in response to client access patterns.

CRISIS: A Wide Area Security Architecture
(T. Anderson)
CRISIS re-examines wide area authentication and access control from the perspective of programmable control of remote resources. CRISIS explores the systematic application of a number of design principles to the problem of wide area security: redundancy to eliminate single points of attack, caching to improve performance and availability over slow and unreliable wide area networks, fine-grained capabilities and roles to enable lightweight control of privilege, and complete local logging of all evidence used to make each access control decision.

Self-Tuning Network File Systems
(T. Anderson)
File system designs have long been driven by changes in the cost and performance of the underlying hardware. A designer must consider the relative cost per byte of memory versus disk, the relative performance of the CPU versus a network access versus a disk access, the relative magnitudes of seek time, rotational delay, and disk bandwidth, not to mention changes in the workload placed on the file system. These changes pose a tremendous challenge to file system designers. Although it may be possible to design a system that is efficient for today's hardware and workload patterns, file system implementations are often used for decades, long after their design assumptions are no longer valid. In this work, we are exploring a new design principle for file systems, called self-tuning. A self-tuning system (1) measures the physical characteristics of the underlying hardware, (2) measures the workload placed on the file system, and (3) adapts the file system behavior to match. Building a self-tuning system requires new algorithms that monitor the environment and adjust behavior appropriately. The alternative to self-tuning is building a file system for a fixed point on the moving target of hardware and workload characteristics, and either living with the resulting system well past its applicability or rebuilding the system from scratch every few years.

Memory System Analysis
(Baer, Bershad, Karlin, H. Levy)
Our research group is investigating techniques that use the operating system to improve memory system performance. We rely on a combination of simple hardware support and operating system modifications to monitor the dynamic behavior of applications. These monitoring mechanisms incur a small overhead at runtime, but the information they collect can be used to identify sources of memory system delays such as cache misses and TLB misses. By identifying and resolving these bottlenecks, we not only pay for the overhead of the monitoring mechanisms, but also significantly improve overall system performance. Click here

for more information.

SPIN -- An Extensible Microkernel for Application-Specific Operating System Services
(
Bershad, Chambers, Eggers)
Application domains, such as multimedia, databases and parallel computing, require operating system services with high performance and high functionality. Existing operating systems provide fixed interfaces and implementations to system services and resources. This makes them inappropriate for applications whose resource demands and usage patterns are poorly matched by the services provided. The SPIN operating system enables system services to be defined in an application-specific fashion through an extensible microkernel. It offers fine-grained control over a machine's logical and physical resources to applications through run-time adaptation of the system to application requirements. The application-specific services are installed into the kernel at run-time. A trusted compiler and pointer-safe language run-time environment ensure that the installed sequences do not violate system integrity. The system runs on DEC ALPHA workstations and Intel PCs. Click here for more information.

Instrumentation and Optimization of WIN32/Intel Executables with Etch
(Bershad, H. Levy)
We have developed an application program performance evaluation and optimization system called Etch for Intel x86 platforms running the Windows/NT operating system. The system allows you to annotate existing binaries with arbitrary instructions (for example, to trace, or perform coverage analysis), or to rewrite an existing binary so that it executes more efficiently. Etch works directly on Win32 x86 executables. It does not require program source code for either measurement or optimization. Click here for more details.

Large Scale Mail Services
(Bershad, H. Levy, T. Anderson)
To better understand the types and structures of services appropriate for large scale clusters, we are developing Porcupine, a highly available, highly scalable, low cost cluster-based mail server. The goal of Porcupine is to provide mail service for 100 million users, or roughly one billion mail messages per day. at a cost of $3/year/user. Porcupine is being developed on a PC-based cluster.

Kimera -- A Factored Java Architecture
(Bershad)
We are implementing a new Java architecture based on factored components for security, performance, and scalability. By locating crucial Java Virtual Machine services at trust-domain boundaries, such as Intranet firewalls, we can make safety enforcement mandatory, ease security management and reduce the processing requirements of Java endpoints. Under such a centralized security architecture, the trusted computing base is minimal and consists of small and simple components whose security can be more readily assured.

SPINE -- A Safe Programmable Integrated Network Environment
(Bershad)
SPINE provides a safe execution environment for applications to directly compute on the network interface (NI). Several network interfaces, for example the Myrinet, as well as devices from the new I2O architecture, provide the infrastructure necessary to compute on the NI itself. SPINE uses ideas from the SPIN extensible operating system (safe language, extensible interfaces, and resource management) and applies these to the NI, which allows applications to customize their interaction with the network. The motivation is to migrate application-specific functionality directly to the NI, thereby reducing I/O related data and control transfers to the host system.

Security in Extensible Systems
(Bershad)
Extensible systems, such as SPIN or Java, raise new security concerns. In these systems, units of code, or extensions, can be added to a running system in almost arbitrary fashion. Extensions closely interact through low latency, but type-safe interfaces to form a tightly integrated system. As extensions can come from arbitrary sources, not all of whom can be trusted to conform to an organization's security policy, such structuring raises the question of how security constraints are enforced in an extensible system. In the SPIN extensible operating system built at the University of Washington, we are experimenting with an access control mechanism to address this problem. Our access control mechanism decomposes access control into a policy-neutral enforcement manager and a security policy manager, and it is transparent to extensions in the absence of security violations. It structures the system into protection domains, enforces the protection domains through access control checks, and performs auditing of system operations. It works by inspecting extensions for their types and operations to determine which abstraction require protection, and by redirecting procedure or method invocations to inject access control operations into the system. By only imposing those dynamic access control operations that are absolutely necessary, we intend to minimize the performance overhead of system security. Click here for more details.

SAAM II Development
(Golde)
SAAM II is a sophisticated computer program for the analysis of certain biomedical data in the context of a proposed model. Its development is an interdisciplinary effort, involving faculty, students, and staff from Computer Science and Engineering, Bioengineering, and the Applied Physics Laboratory. The program encompasses both simulation and fitting of parameters to experimental data. The equations specifying the model can be algebraic or first order ordinary linear or nonlinear differential equations. One of the key elements of the program is that the differential equations are automatically derived from a proposed graphical model of a physical system. The program also performs a statistical analysis of the results.

Design and Analysis of Competitive Algorithms for Online Problems
(Karlin)
Competitive algorithms for online problems are guaranteed to have performance which is close to that of the optimal offline performance on each input. We are in the process of designing and implementing such algorithms for a number of basic systems problems, including cache management algorithms for exploiting spatial locality and bypassing (with Baer), cooperative caching and prefetching in PC clusters and wider area networks (with Levy), and other resource allocation problems. Our focus is on the empirical evaluation of algorithms with provable performance guarantees, with an eye towards algorithms that adapt to observed behavior patterns. In addition, we are looking at a number of more theoretical research questions, revolving around the long-term goal of coming up with general principles that characterize competitive algorithms for online problems.

Vector Quantization and Wavelet Image Compression
(Ladner, Riskin (Electrical Engineering))
The use of digital images is rapidly expanding due to the emerging technologies in numerous fields, such as remote sensing and HDTV. The aim of this project is to provide state of the art, commercially available, tools for storing, transmitting, and analyzing image data. The goal of this project is to develop new algorithms based on VQ and wavelets for different image processing tasks. These are document image processing, document image browsing, and compression for fast transmission of ice information location images.

Opal: A Single-Address-Space Operating System
(H. Levy)
The Opal project is exploring a new operating system structure, tuned to the needs of complex CAD-style applications, where many cooperating programs share a large persistent object database. In Opal, all code and data exist within a single, huge, shared address space. Protection is independent of the single address space; each Opal thread executes within a protection domain that defines which virtual pages it has the right to access. The single address space enhances sharing and cooperation, because addresses have a unique (for all time) interpretation. Thus, pointer-based data structures can be directly communicated and shared between programs at any time, and can be stored directly on secondary storage without the need for translation. This structure is simplified by the availability of a large address space, such as those provided by the DEC Alpha, MIPS R4000, HP/PA-RISC, IBM RS6000, and Sun UltraSPARC. We are currently building an Opal prototype, in conjunction with The Boeing Company, which is testing Opal's structure as a basis for their next-generation airplane CAD system. Click here for more information.

Global Memory Management in Workstation and PC Clusters
(H. Levy, Karlin)
Our work involves the general issue of managing resources globally in high-speed local-area networks. The first phase has been the design and implementation of algorithms that manage memory globally in a cluster. In our system, called GMS (for global memory system), active nodes can use available physical memory pages of "idle" nodes as backing store, reducing the fetch time on those pages by an order of magnitude. We have implemented our system at the lowest level of on the OSF operating system, underneath of both the file and virtual memory systems. GMS currently runs on an ATM-connected network of DEC Alpha workstations. Click here for more information.

Real-Time Systems
(Shaw)
Our research has been concerned with models, techniques, and tools for specifying, analyzing, and generating software for real-time applications. The emphasis is on requirements and design specifications; fast prototyping; real-time operating systems; real-time programming; and methods for describing, predicting, and verifying the timing behavior of programming languages and systems. Current research includes: using execution-time bounds and assertional logic to state and prove timing properties of higher-level language programs; building software tools that predict the deterministic execution times of programs; refining, implementing, and testing a specification method, called communicating real-time state machines; and studying the use of time-stamped event histories in real-time programming. Two themes of the research are the incorporation of time as a first-class specification and programming object, and the analysis and exploitation of concurrency.

DDDDrraw - Distributed 3D Real-time Rendering at Washington
(Zahorjan)
Real-time 3D rendering is an essential component of an increasing number of multimedia applications (e.g., VRML, virtual reality, and geographical information systems). However, despite recent performance gains, the capability of a single workstation to render in real-time is still quite limited. In the DDDDrraw project (pronounced draw), we are working to improve real-time rendering performance through the use of distributed rendering on a cluster of workstations. In this context, improving performance means extending the complexity of the scenes that can be rendered at a sufficiently fast as well as predictable frame rate.


Abstracts of Research in Programming Systems


Design of Object-Oriented Languages
(Chambers)
Object-oriented languages promise to make programs easier to write, to modify, and to extend. However, challenges remain in developing expressive language features that support static, modular reasoning and typechecking. We have been working on a series of theoretical and practical language designs, including the Cecil language and its successor Diesel, with the goal of marrying espressive features like first-class functions, programmable multiple dispatch, and classless objects with first-class modules and strong, static, separate typechecking. We have implemented the Cecil language and written real, large programs in it, in order to gain practical exprience with our language designs and to suggest future ideas.

Efficient Implementation of High-Level Languages
(Chambers)
High-level languages can be powerful to program in, but they usually run more slowly than lower-level languages. We are developing implementation strategies for optimizing high-level languages, particularly those with object-oriented features and first-class functions, in an effort to close the performance gap between high-level and low-level languages. We are studying static analyses, at the intraprocedural, interprocedural, and whole-program level; dynamic profile-guided optimizations; and control flow and data layout transformations. We also are studying which optimizations can best be performed at which stage of a program's lifetime (at traditional compile-time of individual files, or at link-time, or at run-time), and also how a single optimization can be divided into phases, each performed at a different stage.

We have implemented many of these ideas in a language-independent compiler framework, named Vortex. Vortex currently includes front-ends for Java, C++, Smalltalk, Modula-3, and Cecil, and we study the effectiveness and cost of our techniques on large programs in each of these languages. In addition, Vortex itself is a 100,000-line Cecil program, providing practical feedback to the object-oriented language design project.

Dynamic Compilation
(Chambers, Eggers)
Dynamic compilation enables constant-based optimizations on the values of invariant data computed at run-time. Using the values of these run-time constants, a dynamic compiler can eliminate their memory loads, perform constant propagation and folding, remove branches they determine, and fully unroll loops they bound. However, the performance benefits of the more efficient, dynamically-compiled code are offset by the run-time cost of the dynamic compile. Our approach to dynamic compilation strives for both fast dynamic compilation and high-quality dynamically-compiled code: the programmer annotates regions of the programs that should be compiled dynamically; a static optimizing compiler automatically produces machine-code templates that have been pre-optimized by an analysis that identifies which variables will be constant at run-time; and a simple, dynamic compiler copies the templates, patching in the computed values of the run-time constants, to produce optimized, executable code. Our work targets general-purpose, imperative programming languages, initially C. Initial experiments applying dynamic compilation to C programs have produced speed-ups up to nearly a factor of 9. In the short term we plan to extend the framework in several dimensions: to track the flow of run-time constants across procedure boundaries, to develop special forms of additional optimizations, such as register allocation and instruction scheduling, that can be applied quickly at run-time, to investigate the effect on dynamic compilation performance of different microarchitecture designs, to more fully automate the selection of run-time constants and dynamic regions, and to extend our benchmark suite to include a wide range of application areas and larger programs.

Compiling for Simultaneous Multithreading
(Eggers, H. Levy)
Simultaneous Multithreading processors (SMT) is a processor design that permits several independent threads to issue instructions to a (wide) superscalar's functional units in a single cycle. Its microarchitecture is a fairly straightforward extension of today's wide-issue, dynamically scheduling processors. Our experiments have shown that, relative to several other architectures that are competing to be the design paradigm in the next few years, SMT has higher program speedups and useful instruction throughput on a variety of multi-threaded workloads, while barely impacting single-thread performance. Since an SMT architecture changes some fundamental architectural assumptions on which several machine-dependent compiler optimizations rely, we speculated that it also changes the use of these optimizations. So far our experiments have validated this hypothesis: loop distribution, loop tiling, and software speculation should all be applied differently (or not at all!) on an SMT processor than for their original, targetted machines. We suspect the same is true for software prefetching, predicated execution and software pipelining.

ZPL: A Parallel Array Language
(Snyder)
ZPL is a portable, high performance parallel programming language designed for science and engineering applications. Portability means that ZPL runs well on all contemporary parallel machines, including both shared memory and message passing machines and networked clusters. (It has been targetted to many sequential platforms as well.) High performance means that ZPL programs exhibit running times comparable to parallel programs written in C with programmer-specified interprocessor communication. ZPL is also very convenient, introducing the region concept as a replacement for the loops and indexing found in traditional imperative languages. The ZPL compiler performs aggressive code optimization, and pioneered machine independent communication optimizations based on the Ironman Interface. The ZPL language and compiler (released in July 1997) are available from the ZPL home page . Researchers are developing ZPL applications in many areas of science and engineering, including software libraries.

Advanced ZPL: A General Purpose Parallel Programming Language
(Snyder)
Advanced ZPL is the parent language of the recently released ZPL Array Language, a stand alone data parallel programming language. Advanced ZPL contains ZPL as a proper subset. It also includes additional facilities making it a general purpose parallel programming language. A-ZPL's features include support for irregular and sparse data structures, facilities for user specified data allocation and load balancing, pipelining, thread and task parallelism, etc. Advanced ZPL uses the compilation and run-time technology proved out in the ZPL implementation. These facilities demonstrated high performance and portability by building on an underlying abstract machine model (CTA) and an underlying programming model (Phase Abstractions). Critical research topics include the seamless integration of the data parallel facilities of ZPL with the new A-ZPL features, the extension of the region concept to control parallelism, and the exploitation of locality to reduce communications costs. Advanced ZPL's features are designed with close cooperation of scientists and engineers from applications disciplines.


Abstracts of Research in Information Retrieval, Database Systems, and Softbots


Internet Softbots
(Etzioni, Weld)
AI research is moving away from "toy tasks, such as block-stacking, towards more realistic problems. Building autonomous agents that interact with real-world software environments, such as the World Wide Web, is a pragmatically convenient, yet intellectually challenging, substrate for AI research. To support this claim, we are utilizing planning and machine-learning technology to develop several Internet softbots (software robots), customizable and (moderately) intelligent assistants for Internet access. The softbot accepts goals in a high-level language, generates and executes plans to achieve these goals, and learns from its experience. Custom-built execution and sensing modules enable the softbot to interact with the Web in real time.

MetaCrawler
(Etzioni) The MetaCrawler is a Web service that provides a single, central interface for Web document searching. Upon receiving a query, the MetaCrawler posts the query to multiple search engines in parallel, and performs sophisticated pruning on the responses returned. A commercial version of MetaCrawler is available at www.metacrawler.com. The research version at http://huskysearch.cs.washington.edu continues to be the platform for cutting-edge research on web search mechanisms including fast clustering algorithms for search engines results, and collaborative searching architectures.

Management of Semistructured Data
( A. Levy)
Several important applications involve large amounts of irregular semistructured data. Examples include the management of biological data, large volumes of online documentation, program libraries, and integration of data from multiple external sources. Currently, such data resides mostly in structured files, and basic database services such as querying and access control and transaction management are not supported. The main characteristics distinguishing semistructured data from traditional data are the absence of a fixed, predetemined schema, and the fact that the data is not strongly typed as in traditional models. Our research considers the management of semistructured data from several perspectives. First, we are developing appropriate models and query languages for such data. Second, we have developed novel algorithms for optimizing queries over semistructured data. Finally, we are also considering problems conerning the storage of semistructured data and architectures for uniformly combining structured and semistructured data. Our solutions are being tested and incorporated into the Strudel Web site management system, which is based on a semistructured data model.

Web Site Management
( A. Levy)
The World-Wide Web (WWW) has become a prime vehicle for disseminating information. As a result, the number of large Web sites with complex structure and that serve information derived from multiple data sources is increasing. Managing the content and the structure of such Web sites has raised a novel data management problem. We are developing the Strudel Web site management system which is the first system to apply concepts from database management systems to the problem of Web site management. The architecture of Strudel is based on separating three distinct tasks in building Web sites: (1) choosing and accessing the data that will be displayed at the site, (2) designing the site's structure, i.e., specifying the data contained within each page and the links between pages, and (3) designing the graphical presentation of pages. In particular, Strudel provides a mechanism for declaratively describing the structure and content of a Web site. As a result, several tasks that are otherwise very tedious to perform are supported very naturally in Strudel, such as (1) automatically updating a site, (2) restructuring a site, (3) creating multiple versions of a Web site and (4) enforcing integrity constraints on a site's structure. [an error occurred while processing this directive]

Data Integration
( A. Levy, Weld)
An information integration system provides a user with a uniform query interface to multiple information sources such as the thousands available on the Internet. As such, the system frees the user from having to locate the data sources relevant to a query, query each source in isolation, and manually combine the information from the different sources. Our research considers several problems related to data integration systems. First, we have developed methods for describing the contents, capabilities and completeness of data sources. Second, we are developing methods for automatically creating robust and efficient query execution plans for user queries. Our work develops novel techniques in the areas of planning, database query optimization and knowledge representation.

HCI Issues in Web Site Management
( Borning, A. Levy)
Building tools for Web site design and management raises a number of interesting challenges in human-computer interaction and data management. The web site design and maintenance task can be seen as consisting of three subtasks: data management, structure management and graphical presentation. The work on Strudel has demonstrated the value of this decomposition, as well as providing powerful declarative tools for data and structure management. We are now investigating user interface metaphors to make these tools more accessible to designers, along with interfaces that embody these metaphors. Also, we are investigating whether the HTML generation languages that support the third subtask -- graphical presentation -- can incorporate additional declarative tools as well, namely constraints on possible layout of the pages, as well as specifications of how the layout is to be determined from the structure. At a more fundamental level we are developing constraint languages and efficient constraint solvers that incorporate both layout constraints and structural constraints.


Abstracts of Research in Software Engineering


Symbolic Model Checking of Software Specifications
(R. Anderson, Beame, Notkin)
Errors in software specifications cost money and, in some cases, threaten lives. Symbolic model checking based on binary decision diagrams (BDDs) is an efficient automatic verification technique that has been applied successfully to many hardware designs but only in a very limited way to software specifications. We apply this technique to large safety-critical software specifications. We show how to provide effective input encodings to symbolic model checkers and tune their algorithms to handle specifications given as communicating hierarchical state machines. Although BDD-based symbolic model checking typically is efficient, it can bog down when dealing with non-linear arithmetic that is prevalent in some software specifications. We develop extensions of standard model-checking techniques that hold the promise of circumventing this non-linear arithmetic bottleneck.

Constraint-Based Systems
(Borning)
A constraint describes some relation that we would like to keep maintained, for example, that a column in a table in a web page is at least 20 pixels wide, that the background color in a document becomes more yellow with age, or that window w1 is twice as high as window w2. In recent research we developed a number of fast and expressive constraint solvers for interactive graphical applications. We are now applying these solvers to web page layout, including html extensions to allow the page author to express constraints, and to interactive graphical editors. For more information please see the web pages for Constraint-Based Systems.

Program Restructuring
(Notkin)
Software maintenance tends to become more difficult as a system evolves, in part because the structure of the system degrades. Restructuring the system to overcome such entropy is one approach to reducing maintenance costs. However, restructuring does not make visible "progress" and it is itself error-prone. We are investigating an alternative approach, meaning-preserving restructuring. In this approach, the software engineer applies structural changes and a tool then finds a set of global, compensating changes that (together with the engineer's explicit change) will guarantee that the meaning of the program remains unchanged.

Reflection Models
(Notkin)
Given an existing software system and a task to accomplish, software engineers decide whether and how to proceed based on experience, judgment, and system-specific knowledge. For even modest-sized software systems, there is a substantial risk of making poor decisions, because it is difficult (if not impossible) for software engineers to maintain adequate knowledge of the source. To manage this risk, engineers often use high-level models to focus on particular aspects of the system relevant to the task. These models help the engineer reduce risk only if they are sufficiently accurate representations of the actual system. We have developed an approach that may help engineers determine whether their task-specific models are "good enough" to make informed decisions. An engineer defines a high-level structural model appropriate for the task at hand and also specifies how the model maps to the source. A tool then computes a reflexion model that shows the engineer where the high-level model is consistent with and differs from the source. Analyzing the reflection model and applying personal experience and judgment, an engineer can then assess the risk of proceeding with the task.

Object-Oriented Frameworks
(Notkin)
Frameworks are one of several approaches to exploiting the commonalities of a family of software systems. They have been used successfully in user interfaces, network management, and operating systems. Applying framework operations, such as instantiating a framework to a specific application, can be error-prone. This is, in part, because any errors in the operation are not found until the application itself is executed. We are exploring approaches to reducing these problems (and thus increasing the benefits of using frameworks) through the use of static typing in object-oriented languages and tools to help execute these framework operations.

Implicit Invocation Mechanisms
(Notkin)
Implicit invocation (also known as event-based) mechanisms allow software components to announce interesting events that cause the invocation of other components that have registered interest in those events. When carefully applied, implicit invocation can be used to structure software systems that are more easily integrated and that are more easy to evolve. We are exploring effective design approaches using implicit invocation, as well as the design space for implicit invocation mechanisms (including language-based and domain-specific mechanisms).

Source Model Extraction
(Notkin)
Reverse engineers depend on the automatic extraction of information from source code. Some useful kinds of information -- source models -- are well-known: call graphs, file dependences, etc. Predicting every kind of source model that a reverse engineer may need is impossible. We have developed a lightweight approach for generating flexible and tolerant source model extractors from lexical specifications. We have also performed an empirical study of a set of C call graph extractors. These extractors return different, often significantly different, call graphs when applied to the same programs.

Real-Time Systems
(Shaw)
Our research has been concerned with models, techniques, and tools for specifying, analyzing, and generating software for real-time applications. The emphasis is on requirements and design specifications; fast prototyping; real-time operating systems; real-time programming; and methods for describing, predicting, and verifying the timing behavior of programming languages and systems. Current research includes: using execution-time bounds and assertional logic to state and prove timing properties of higher-level language programs; building software tools that predict the deterministic execution times of programs; refining, implementing, and testing a specification method, called communicating real-time state machines; and studying the use of time-stamped event histories in real-time programming. Two themes of the research are the incorporation of time as a first-class specification and programming object, and the analysis and exploitation of concurrency.


Abstracts of Research in Computer Graphics and Computer Vision


3D Scanning, Reconstruction, and Rendering
(Curless, Duchamp (Mathematics),Stuetzle (Statistics))
3D scanning is a powerful tool for acquiring the shape and appearance of unique physical objects, and for bringing the creative design process from the sculptor into the computer. Capturing the properties of existing objects has driving applications in reverse engineering, inspection, virtual dissemination of museum artifacts, and anatomical modeling for medicine. In addition, when desiging new objects such as automobiles or synthetic movie characters, clay is frequently a superior "interface" than most computer-aided geometric design systems; 3D scanning provides the means of bringing the resulting shape into the computer. In our research, we are concerned with all phases of 3D scanning. We are developing new methods for acquiring high resolution shape and color data. We are exploring ways of reconstructing useful computer models from this data. And we are devising algorithms that enable real-time interaction with these models by taking advantage of current trends in 3D graphics hardware development.

Multiresolution Methods in Computer Graphics
(Salesin)
We are exploring the application of multiresolution methods, such as "wavelets," to problems in computer graphics. These graphics applications include image editing and compression, automatic level-of-detail control for editing and rendering curves and surfaces, surface reconstruction from contours, and fast methods for solving simulation problems in global illumination and animation. Our results to date are described in the research monograph, Wavelets for Computer Graphics: Theory and Applications, and in the following papers: