UW Cecil Group : Past Research

NOTE: Current research projects are described on the WASP Group's web pages. Below are descriptions of past research projects.



Cobalt: Provably Correct Compiler Optimizations

Cobalt is a domain-specific language for expressing compiler optimizations. By providing natural abstractions for declaratively specifying compiler optimizations, Cobalt makes it easier for humans to write and reason about optimizations. But more importantly, Cobalt makes it possible for a computer to effectively understand, process and reason about optimizations. Here are some examples of goals we hope to achieve once optimizations are expressed in Cobalt:

All of these goals will provide support not only for compiler writers but also for programmers who want to extend the compiler.

Note: Cobalt has been superceded by Rhodium.

Composable Dataflow Analyses

The goal of this work is to ease the task of writing compilers. One of the problems facing compiler writers is the phase ordering problem: dataflow analyses can interact in mutually beneficial ways, and no single order of running the analyses will exploit all the mutually beneficial interactions. Previous efforts to exploit these interactions have either (1) iteratively performed each individual analysis until no further improvements are discovered or (2) developed ``super-analyses'' that manually combine conceptually separate analyses. We have devised a new approach that allows analyses to be defined independently while still enabling them to be combined automatically and profitably. Our approach avoids the loss of precision associated with iterating individual analyses and the implementation difficulties of manually writing a super-analysis. We have implemented our framework in the Vortex compiler using a control flow graph representation and in the Whirlwind compiler using a dataflow graph representation.

Modular Extensibility: MultiJava and EML

Language support for reusable components fosters a programming style based on combining existing abstractions rather than building programs from scratch. For such a programming style to be practical, programmers must be able to extend components easily in client-specific ways, without modifying the original components. This project aims to address the extensibility limitations of components in traditional languages, while maintaining the ability to modularly typecheck and compile such components.

MultiJava

MultiJava is a backward-compatible extension to Java that addresses some of the extensibility limitations of traditional object-oriented languages. Open classes allow one to add to the set of methods that an existing class supports without editing that class or its clients, obviating the need for tedious workarounds like the "visitor" design pattern. Multimethods generalize traditional receiver-based dynamic dispatching by allowing all arguments to be uniformly dispatched upon, resolving an asymmetry that makes some common idioms difficult to program. MultiJava retains Java's class-based encapsulation properties as well as its modular typechecking and compilation schemes. An implementation of MultiJava is available for download.

EML

Extensible ML (EML) is an ML-like language that adds support for object-oriented idioms in a functional setting. Rather than adding a new set of constructs to ML, EML simply generalizes ML-style datatypes and functions to be hierarchical and extensible, allowing class hierarchies and OO-style methods to be seamlessly integrated with the traditional functional style.

Note: work on this class of languages continues with the F(EML) language and the Diamond project.

Software Architecture: ArchJava

ArchJava is an extension to Java that allows programmers to express the software architecture of a program within the source code. ArchJava's type system ensures that the implementation code obeys the constraints imposed by the architecture. Our initial experience in two case studies suggests that ArchJava can express architectural structure effectively within an implementation, and that it may aid program understanding and software evolution tasks. ArchJava incorporates AliasJava, an extension to Java with an ownershiptype system for controlling aliasing.

Hydro: Evolvable Software

Hydro is our umbrella project for research in programming systems support for evolvable ubiquitous computing systems. Ubiquitous computing systems, as a class of distributed systems, present stiff challenges for application programmers: such systems must run continuously on long time scales, handling failure and incremental system evolution while remaining robust. We intend to use language and program analysis technology to help programmers cope with these challenges. Our current work includes HydroJ, a language design for semi-structured message passing, that supports loosely coupled system construction. In future work, we will also investigate applying program analysis techniques to the verification of temporal and behavioral properties of such systems as they evolve over time. This is joint work in collaboration with the Intel Research Laboratory @ Seattle and the Portolano Group at the University of Washington.

Staged/Dynamic Compilation

Information relevant to program optimization is available over many stages in the life of a program. For instance, much information about individual functions is available at separate compile time, that about groups of functions is available at library assembly time, that about a whole program may be available at link-time and that about particular executions of a program may be available at run-time. In order to exploit all this information, a conventional compiler would have to wait until the last stage when information is available before commencing execution. Unfortunately, the later the stage, the faster the compilation needs to be, and the more difficult it becomes to engineer the compiler to produce high-quality code while remaining fast.

Staged compilation is a technique that allows much of the optimization to happen in earlier stages based on the partial inputs available in those stages, leaving less work for the later stages. Our work thus far has focused on producing very fast compilers for the run-time stage. We have succeeded in producing staged compiler pipelines that perform intra-procedural optimization up to an order of magnitude faster than their unstaged versions. We are now working on staging whole-program optimizations.

Note: current research on staged compilation now takes place in the context of Whirlwind.

Cecil

Cecil is a purely object-oriented language intended to support rapid construction of high-quality, extensible software. Cecil incorporates multi-methods, a simple prototype-based object model, a mechanism to support a structured form of computed inheritance, module-based encapsulation, and a flexible static type system which allows statically- and dynamically-typed code to mix freely. An implementation of Cecil is available for download.

Note: Cecil has been superceded by Diesel.

Vortex

Vortex is an optimizing compiler infrastructure for object-oriented and other high-level languages. It targets both pure object-oriented languages like Cecil, Diesel, and Smalltalk and hybrid object-oriented languages like C++, Modula-3, and Java. Vortex currently incorporates high-level optimizations such as static class analysis, class hierarchy analysis, profile-guided receiver class prediction, profile-guided selective procedure specialization, intraprocedural message splitting, automatic inlining, and static closure analyses. It also includes a collection of standard intraprocedural analyses such as common subexpression elimination and dead assignment elimination. The Vortex compiler is written entirely in Cecil. Vortex is available for download.

Note: current research and development of advanced optimizations and compilation techniques now takes place in the context of Whirlwind.