University of Washington
Computer Science & Engineering

Abstracts of Research in Embedded Systems and Reconfigurable Computing


Embedded System Synthesis
(Borriello)
Automated synthesis of the hardware and software elements of embedded controllers is critical to rapidly evaluating the many design choices faced by designers of task-specific computing and communication devices. We are currently developing a synthesis system, as part of the Chinook project, that will automatically generate much of the low-level software needed to coordinate the elements of an embedded system, including detailed scheduling to meet real-time constraints. Thus, our principal interest is to make embedded software more portable by automatically generating the most error-prone and system-specific portions of the software. This includes the automatic customization of generic device drivers and application-specific, real-time kernels to execute the software on multiple processors.

Component-Based Design
(Borriello)
Modern software design involves the reuse of a variety of existing components. These components encapsulate functionality associated with a particular task, such as a web browser, management of a wireless connection, user interface widgets, etc. Currently, these components are provided to designers via an API abstraction for a particular platform or system architecture. This style supports data composition well (i.e., the transfer of data packets in a data-flow style) but does not do as well for control composition (where the internal state changes of one component must also affect others). We are developing a specification methodology that supports both data and control composition at a high level of abstraction and supports the automatic synthesis of all the coordinating code (that handles communication, synchronization, and consistency between the components). Our goal is to make software components more easily reusable and retargetable and to optimize them for different system architectures (e.g., use the same specification to generate a family of applications including those that may run solely on a workstation, as part of an embedded system, or distributed over a network). We are currently developing an integrated development environment that supports this style of specification and software design.

Information Appliances and Wearable Computing
(Borriello)
The future of computing, for the majority of people, is not going to be on the desktop. Rather, it will be in devices unobtrusively embedded in their homes and cars and carried on their person. There are interesting issues in constructing such devices that relate to their connectivity, user input/output modalities, and power sources. We are actively working on speech recognition and synthesis (as may be used in a cellular phone that connects to information services on the web) and on uses of small personal digital assistants in home and automobile environments (for example, to remotely control home appliances and to utilize GPS information for coordinating routes and modes of transportation). Our aim is to prototype interesting new devices and use them as drivers for our work in component-based design and embedded system synthesis.

Silicon Neuroscience
(Diorio)
Nervous tissue (e.g., the brain) computes using hardware and algorithms that are fundamentally different from those of our digital computers. Our research group believes that we can learn new principles of computing by studying neurobiology. Our goal is to investigate the foundations of neuronal computation, and to build silicon integrated circuits that compute using these principles. We call our approach silicon neuroscience: the development of neurally inspired silicon-learning systems. We have already developed, in a standard CMOS process, a family of single-transistor devices that we call synapse transistors. Like neural synapses, synapse transistors learn locally, and they do so at biologically realistic timescales. Although we do not believe that a single transistor can model the complex behavior of a neural synapse completely, our synapse transistors do learn from their inputs. Using them, we intend to develop silicon systems that learn from their environment.

Pipelined-Systolic RADAR Signal Processing
(Diorio, Sahr (Electrical Engineering))
Researchers in the Radar Remote Sensing Laboratory (rcs.ee.washington.edu/SPP/) are building a passive, bistatic radar for real-time measurements of ionospheric fluctuations. This radar requires a processing throughput of 10-100 GOPS for a two-receiver system; the computational burden increases as the square of the number of receivers (additional receivers enable angle-of-arrival estimation and target imaging). We are optimizing the signal-processing algorithms for integrated circuit implementation, and we are developing custom, pipelined-systolic CMOS ICs with the requisite throughput to enable real-time ionospheric measurement.
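
As a rough illustration of the scaling stated above, the following sketch extrapolates the two-receiver throughput range to larger systems. It assumes the cost is exactly proportional to the square of the receiver count, with no fixed overhead; the function name required_gops is hypothetical and chosen only for this example.

```python
def required_gops(num_receivers, two_receiver_gops):
    """Extrapolate throughput from a two-receiver baseline.

    Assumption (for illustration only): cost is exactly proportional
    to num_receivers ** 2, so the baseline is scaled by (N / 2) ** 2.
    """
    if num_receivers < 2:
        raise ValueError("the baseline figure is for a two-receiver system")
    return two_receiver_gops * (num_receivers / 2) ** 2


if __name__ == "__main__":
    # The 10 and 100 GOPS endpoints come from the abstract's two-receiver range.
    for n in (2, 3, 4, 8):
        low = required_gops(n, 10.0)
        high = required_gops(n, 100.0)
        print(f"{n} receivers: {low:.1f} to {high:.1f} GOPS")
```

Under this assumption, doubling the number of receivers quadruples the required throughput.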

Action-Potential Based Computing
(Diorio, Atlas (Electrical Engineering))
Conventional digital-logic systems communicate using discrete, binary voltages, transmitted on passive metal wires, and are synchronized by a system clock. By contrast, neurons, the brain's primary computing elements, communicate using millisecond-long impulse-like voltage waveforms called action potentials, transmitted on nondispersive active wires (called axons), and are not clocked. Recent neurophysiological data lend credence to computing models that use action-potential timing, and tuned wire delays, to enable self-timed signal processing. We are investigating the relationship between delay-based computing and time-frequency analysis, and are beginning to develop hardware models for computing using delta-function impulses and wire delays.
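
One simple way to picture computing with impulses and tuned wire delays is a coincidence detector: each input line delays its impulses by a fixed amount, and the detector responds only when impulses from every line arrive together. The sketch below is a hypothetical toy model using integer time steps, not a description of the hardware models under development; the function name coincidences is an assumption made for this example.

```python
def coincidences(spike_trains, delays):
    """Return the times at which delayed impulses from all lines coincide.

    spike_trains: list of lists of integer impulse times, one per input line.
    delays: one integer wire delay per input line (the "tuning").
    Toy model: impulses are ideal and coincide only at identical time steps.
    """
    if len(spike_trains) != len(delays):
        raise ValueError("need exactly one delay per input line")
    if not spike_trains:
        return []
    # Shift every impulse on a line by that line's wire delay.
    delayed = [{t + d for t in train} for train, d in zip(spike_trains, delays)]
    # The detector fires only at times present on every delayed line.
    return sorted(set.intersection(*delayed))


if __name__ == "__main__":
    line_a = [0, 10]
    line_b = [3, 20]
    # Delays tuned to a 3-step offset between the lines detect the first pair.
    print(coincidences([line_a, line_b], [3, 0]))
    # With no delays, the two lines never coincide.
    print(coincidences([line_a, line_b], [0, 0]))
```

In this toy model the choice of delays selects which timing pattern the detector responds to, and no clock is involved.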

A Configurable ASIC Compiler for Compute-Intensive Applications
(Ebeling)
ASICs can provide a large price-performance advantage over DSPs for many compute-intensive applications. But ASIC technology is not appropriate for applications that present a moving target in terms of evolving standards, multiple uses, or algorithmic improvements. For such applications, the implementation must be reprogrammable to avoid premature obsolescence. We are working on a hardware compiler that generates configurable ASICs from high-level descriptions of compute-intensive tasks. A configurable ASIC uses a combination of programmed control and static reconfiguration of hardware resources to achieve flexibility without compromising performance. The configurable ASIC can be built with more or less flexibility, depending on the range of in situ reprogrammability that the intended application requires.

Field-Programmable Gate Arrays
(Ebeling)
Reconfigurable hardware is rapidly increasing in popularity and complexity. Our research in FPGAs falls into three different categories. First, we have developed and patented a new FPGA architecture, called Triptych, that achieves 2 to 4 times higher logic density than existing commercial FPGAs. Second, we have developed a set of architecture-independent, performance-driven mapping tools for exploring new FPGA architectures. These tools have been retargeted to a new IBM FPGA architecture, where they have achieved better performance and density than industry tools. Third, we are now investigating a new coarse-grained FPGA architecture intended for computation-intensive applications, such as digital signal processing. We expect to achieve an order of magnitude improvement in density over current FPGA architectures.

Architectural Retiming
(Ebeling)
Pipelining is probably the most common technique for improving system performance, and retiming can be used to automatically optimize the performance obtained by pipelining. However, in many situations pipelining and retiming are not possible, either because of latency constraints or because a feedback cycle makes pipelining impossible. We are investigating a set of techniques that allow pipelining in these cases by changing the architecture of the circuit. Whereas retiming preserves the function of the circuit exactly on a cycle-by-cycle basis, changing only the position of registers in the circuit, architectural retiming relies on logic transformations, preserving functionality only in terms of input/output sequences. We are developing a CAD tool called ART, which applies these transformations, relying initially on the designer to evaluate and choose those that achieve the best performance.
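
The sketch below illustrates only the conventional retiming property mentioned above, that moving registers leaves the output unchanged on every cycle. It does not model the logic transformations of architectural retiming or the ART tool. The two-input adder circuit, the zero initial register state, and both function names are assumptions chosen for this example.

```python
def register_after_adder(a_seq, b_seq):
    """Adder followed by a single output register (initial state 0)."""
    reg = 0
    outputs = []
    for a, b in zip(a_seq, b_seq):
        outputs.append(reg)  # value visible during this cycle
        reg = a + b          # captured at the end of the cycle
    return outputs


def registers_before_adder(a_seq, b_seq):
    """Same circuit after retiming: the register is moved back across
    the adder, becoming one register on each input (initial state 0)."""
    reg_a, reg_b = 0, 0
    outputs = []
    for a, b in zip(a_seq, b_seq):
        outputs.append(reg_a + reg_b)  # adder now sits after the registers
        reg_a, reg_b = a, b            # captured at the end of the cycle
    return outputs


if __name__ == "__main__":
    a = [1, 2, 3, 4]
    b = [10, 20, 30, 40]
    # Both versions produce the same value on every cycle.
    print(register_after_adder(a, b))
    print(registers_before_adder(a, b))
```

Only the register positions differ between the two versions; the combinational logic is untouched, which is why the outputs match cycle by cycle.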

Chaos Routing
(Snyder, Ebeling, Bolding)
The Chaos router is a randomizing, non-minimal adaptive packet router that has been shown to have higher throughput and lower latency than state-of-the-art oblivious routers for both two-dimensional (e.g., mesh and torus) networks and hypercube networks on uniform and non-uniform traffic patterns. Randomization is crucial, since it allows the Chaos router to be naturally livelock-free (i.e., packets do not circulate forever) without any special circuitry. The Chaos router has been implemented in 1.2-micron CMOS with a speed of 66 MHz, the speed limit for this technology, using a pipelined design. Research has included processor/network interface designs, characterizations of Chaos routing under various workloads, studies of fault-tolerance issues in adaptive routing, theoretical characterizations of routing algorithms, and a general study of high-performance routing algorithms. Currently, the 0.6-micron Chaos chip is back from fab and is being incorporated into a Chaos LAN design.
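
To give a feel for why randomized, non-minimal routing need not trap a packet forever, the toy simulation below moves a single packet across a two-dimensional mesh. It is not the Chaos router's algorithm: there are no queues or competing traffic, and a fixed deroute probability stands in for congestion. All names and parameters (route, deroute_prob, max_hops) are hypothetical.

```python
import random


def productive_moves(pos, dest):
    """Neighbouring nodes that are one hop closer to the destination."""
    x, y = pos
    moves = []
    if dest[0] != x:
        moves.append((x + (1 if dest[0] > x else -1), y))
    if dest[1] != y:
        moves.append((x, y + (1 if dest[1] > y else -1)))
    return moves


def neighbors(pos, size):
    """All in-bounds neighbours of pos on a size x size mesh."""
    x, y = pos
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(i, j) for i, j in candidates if 0 <= i < size and 0 <= j < size]


def route(src, dest, size, deroute_prob, rng, max_hops=100_000):
    """Return the hop count for one packet travelling from src to dest.

    With probability deroute_prob the packet is sent to a random
    neighbour (possibly away from dest); otherwise it takes a random
    productive hop. max_hops is a safety cap for the simulation only.
    """
    pos, hops = src, 0
    while pos != dest:
        if hops >= max_hops:
            raise RuntimeError("safety cap reached")
        if rng.random() < deroute_prob:
            pos = rng.choice(neighbors(pos, size))
        else:
            pos = rng.choice(productive_moves(pos, dest))
        hops += 1
    return hops


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    print("minimal path hops:", route((0, 0), (3, 2), 4, 0.0, rng))
    print("hops with derouting:", route((0, 0), (3, 2), 4, 0.3, rng))
```

Because every deroute is chosen at random, the chance that the packet keeps missing its destination shrinks with each additional hop, which is the intuition behind probabilistic livelock freedom in this toy setting.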