Embedded System Synthesis
(Borriello)
Automated synthesis of the hardware and software elements of embedded
controllers is critical to rapidly evaluating the many design choices
faced by designers of task-specific computing and communication devices.
We are currently developing a synthesis system, as part of the Chinook
project, that will automatically generate much of the low-level software
needed to coordinate the elements of an embedded system, including detailed
scheduling to meet real-time constraints. Thus, our principal interest
is to make embedded software more portable by automatically generating
the most error-prone and system-specific portions of the software. This
includes the automatic customization of generic device drivers and
application-specific, real-time kernels to execute the software on
multiple processors.
Component-Based Design
(Borriello)
Modern software design involves the reuse of a variety of existing components.
These components encapsulate functionality associated with a particular task,
such as a web browser, management of a wireless connection, or user interface
widgets. Currently, these components are provided to designers via
an API abstraction for a particular platform or system architecture. This
style supports data-composition well (i.e., the transfer of data packets in a
data-flow style) but does not do as well for control-composition (where
the internal state changes of one component must also affect others). We are
developing a specification methodology that supports both data and control
composition at a high level of abstraction and supports the automatic
synthesis of all the coordinating code (that handles communication,
synchronization, and consistency between the components). Our goal is to
make software components more easily reusable and retargetable and to
optimize them for different system architectures (e.g., use the same
specification to generate a family of applications including those that may
run solely on a workstation, as part of an embedded system, or distributed
over a network). We are currently developing an integrated development
environment that supports this style of specification and software design.
Information Appliances and Wearable Computing
(Borriello)
The future of computing, for the majority of people, is not going to
be on the desktop. Rather, it will be in devices unobtrusively embedded
in their homes and cars and carried on their person. There are interesting
issues in constructing such devices that relate to their connectivity,
user input/output modalities, and power sources. We are actively working
on speech recognition and synthesis (as may be used in a cellular phone
that connects to information services on the web) and on uses of small
personal digital assistants in home and automobile environments (for
example, to remotely control home appliances and utilize GPS information
for coordinating routes and modes of transportation). Our aim is to
prototype interesting new devices and use them as drivers for our work
in component-based design and embedded system synthesis.
Silicon Neuroscience
(Diorio)
Nervous tissue (e.g., the brain) computes using hardware and
algorithms that are fundamentally different from our digital
computers. Our research group believes that we can learn new
principles of computing by studying neurobiology. Our goal is
to investigate the foundations of neuronal computation, and to
build silicon integrated circuits that compute using these
principles. We call our approach silicon neuroscience: the
development of neurally inspired silicon-learning systems.
We have already developed, in a standard CMOS process, a
family of single-transistor devices that we call synapse
transistors. Like neural synapses, synapse transistors
learn locally. And they do so at biologically realistic
timescales. Although we do not believe that a single
transistor can model the complex behavior of a neural
synapse completely, our synapse transistors do learn
from their inputs. Using them, we intend to develop
silicon systems that learn from their environment.
Pipelined-Systolic RADAR Signal Processing
(Diorio, Sahr (Electrical Engineering))
Researchers in the Radar Remote Sensing Laboratory
(rcs.ee.washington.edu/SPP/) are building
a passive, bistatic radar for real-time measurements
of ionospheric fluctuations. This radar requires a
processing throughput of 10-100 GOPS for a two-receiver
system; the computational burden increases as the square
of the number of receivers (additional receivers enable
angle-of-arrival estimation and target imaging).
We are optimizing the signal-processing algorithms
for integrated circuit implementation, and we are
developing custom, pipelined-systolic CMOS ICs with
the requisite throughput to enable real-time
ionospheric measurement.
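The quadratic growth in computational burden can be made concrete with a small back-of-the-envelope sketch. It assumes pure quadratic scaling anchored at the quoted two-receiver figure of 10-100 GOPS; the function name `required_gops` and the receiver counts are hypothetical and chosen only for illustration, and the real constant factors may differ.

```python
def required_gops(n_receivers, two_receiver_gops):
    """Scale a two-receiver throughput figure with the square of the receiver count.

    Assumption: throughput(N) = throughput(2) * (N / 2) ** 2.
    """
    if n_receivers < 2:
        raise ValueError("the quoted baseline is for a two-receiver system")
    return two_receiver_gops * (n_receivers / 2) ** 2


if __name__ == "__main__":
    # Low and high ends of the quoted 10-100 GOPS two-receiver range.
    for n in (2, 4, 8):
        low = required_gops(n, 10.0)
        high = required_gops(n, 100.0)
        print(f"{n} receivers: roughly {low:.0f}-{high:.0f} GOPS")
```

Under this assumption, doubling the number of receivers quadruples the required throughput.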
Action-Potential Based Computing
(Diorio, Atlas (Electrical Engineering))
Conventional digital-logic systems communicate using
discrete, binary voltages, transmitted on passive metal
wire, and are synchronized by a system clock. By contrast,
neurons, the brain's primary computing elements, communicate
using millisecond-long impulse-like voltage waveforms
called action potentials, transmitted on nondispersive
active wire (called an axon), and are not clocked. Recent
neurophysiological data lends credence to computing models
that use action-potential timing, and tuned wire delays,
to enable self-timed signal processing. We are investigating
the relationship between delay-based computing and
time-frequency analysis, and are beginning to develop
hardware models for computing using delta-function
impulses and wire delays.
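One simple way to picture computing with impulse timing and tuned wire delays is a coincidence detector: each input line carries a fixed delay, and the detector responds only when the delayed impulses arrive together. The sketch below is a toy illustration under that assumption, not the hardware model under development; the names `arrival_times` and `detects_pattern`, the millisecond values, and the coincidence window are all hypothetical.

```python
def arrival_times(spike_times, delays):
    """Time at which each impulse reaches the detector after its wire delay."""
    return [t + d for t, d in zip(spike_times, delays)]


def detects_pattern(spike_times, delays, window=0.1):
    """Return True when all delayed impulses arrive within `window` of each other."""
    arrivals = arrival_times(spike_times, delays)
    return max(arrivals) - min(arrivals) <= window


# Target timing pattern (in ms) on three input lines.
pattern = [0.0, 2.0, 5.0]

# Tune each wire delay so that the target pattern arrives simultaneously.
tuned_delays = [max(pattern) - t for t in pattern]

if __name__ == "__main__":
    print(detects_pattern(pattern, tuned_delays))          # matching pattern
    print(detects_pattern([0.0, 4.0, 5.0], tuned_delays))  # mistimed second impulse
```

No clock is involved: the result depends only on the relative timing of the impulses and on the delays, which is the self-timed property the text describes.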
A Configurable ASIC Compiler for Compute-Intensive Applications
(Ebeling)
ASICs can provide a large price-performance advantage over DSPs for
many compute-intensive applications. But ASIC technology
is not appropriate for applications that present a moving target in
terms of evolving standards, multiple uses, or algorithmic
improvements. For such applications, the implementation must be
reprogrammable to avoid premature obsolescence.
We are working on a hardware compiler that generates configurable
ASICs from high-level descriptions of compute-intensive tasks. A
configurable ASIC uses a combination of programmed control and static
reconfiguration of hardware resources to achieve flexibility without
compromising performance. The
configurable ASIC can be built with more or less flexibility, depending
on the range of in situ reprogrammability that the intended application requires.
Field-Programmable Gate Arrays
(Ebeling)
Reconfigurable hardware is rapidly increasing in popularity and complexity.
Our research in FPGAs falls into three categories.
First, we have developed and patented a new FPGA architecture,
called Triptych, that achieves 2 to 4 times higher logic density than
existing commercial FPGAs.
Second, we have developed a set of architecture-independent,
performance-driven mapping tools for exploring new FPGA architectures.
These tools have been retargeted to a new IBM FPGA architecture,
where they have achieved better performance and density than industry tools.
Third, we are now investigating a new coarse-grained FPGA architecture
intended for computation-intensive applications,
such as digital signal processing.
We expect to achieve an order of magnitude improvement in density over
current FPGA architectures.
Architectural Retiming
(Ebeling)
Pipelining is probably the most common technique for improving system
performance, and retiming can be used to automatically optimize the
performance obtained by pipelining.
However, in many situations pipelining and retiming are not possible,
either because of latency constraints or because a feedback cycle makes
pipelining impossible.
We are investigating a set of techniques that allow pipelining in these
cases by changing the architecture of the circuit.
Whereas retiming preserves the function of the circuit exactly on a
cycle-by-cycle basis,
changing only the position of registers in the circuit,
architectural retiming relies on logic transformations,
preserving functionality only in terms of input/output sequences.
We are developing a CAD tool called ART which applies these transformations,
relying initially on the designer to evaluate and choose those that achieve
the best performance.
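For contrast, conventional retiming can be illustrated with a tiny simulation: moving a register from the output of an adder to its two inputs changes only where the registers sit, and the output is identical on every cycle. This is a minimal sketch of ordinary retiming, not of architectural retiming or of ART; the function names and the all-zero initial register values are assumptions made for illustration.

```python
def register_after_adder(a_seq, b_seq):
    """One register on the adder output; registers start at 0."""
    reg = 0
    outputs = []
    for a, b in zip(a_seq, b_seq):
        outputs.append(reg)  # output shows the value captured last cycle
        reg = a + b          # register captures the sum at the clock edge
    return outputs


def registers_before_adder(a_seq, b_seq):
    """The same circuit after moving the register backward across the adder."""
    reg_a, reg_b = 0, 0
    outputs = []
    for a, b in zip(a_seq, b_seq):
        outputs.append(reg_a + reg_b)  # adder now combines registered inputs
        reg_a, reg_b = a, b            # both input registers capture at the edge
    return outputs


if __name__ == "__main__":
    a = [1, 2, 3, 4]
    b = [10, 20, 30, 40]
    print(register_after_adder(a, b))
    print(registers_before_adder(a, b))
```

Both versions produce the same value on every cycle, which is the cycle-by-cycle equivalence that retiming guarantees. Architectural retiming relaxes this to equivalence of input/output sequences.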
Chaos Routing
(Snyder, Ebeling, Bolding)
The Chaos router is a randomizing, non-minimal adaptive packet router
that has been shown to have higher throughput and lower latency
than state-of-the-art oblivious routers for both two-dimensional
(e.g., mesh and torus) networks and hypercube networks
on uniform and non-uniform traffic patterns.
Randomization is crucial,
since it allows the Chaos router to be naturally livelock free
without any special circuitry
(i.e., packets do not circulate forever).
The Chaos router has been implemented in 1.2-micron CMOS at a speed of 66 MHz,
the speed limit for this technology, using a pipelined design.
Research has included processor/network interface designs,
characterizations of Chaos routing under various workloads,
studies of fault tolerance issues in adaptive routing,
theoretical characterizations of routing algorithms,
and a general study of high performance routing algorithms.
Currently, the 0.6-micron Chaos chip is back from fab and is being
incorporated into a Chaos LAN design.
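The role of randomization in avoiding livelock can be sketched with a simplified model. Assume that when a router's queue is full, the packet sent away from its destination (derouted) is picked uniformly at random from the queue, independently each time. The chance that one particular packet is picked on every one of k occasions then shrinks geometrically with k. This model, and the names `pick_deroute_victim`, `prob_always_derouted`, and `simulate`, are assumptions for illustration and do not describe the Chaos router's actual circuitry.

```python
import random


def pick_deroute_victim(queue, rng):
    """Return the index of a queued packet chosen uniformly at random."""
    return rng.randrange(len(queue))


def prob_always_derouted(queue_size, k):
    """Probability that one given packet is the victim in k independent picks."""
    return (1.0 / queue_size) ** k


def simulate(queue_size, k, trials, seed=0):
    """Estimate the same probability by repeated random selection."""
    rng = random.Random(seed)
    queue = list(range(queue_size))
    hits = 0
    for _ in range(trials):
        # Packet 0 is "unlucky" only if it is chosen in all k picks.
        if all(pick_deroute_victim(queue, rng) == 0 for _ in range(k)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(k, prob_always_derouted(5, k))
```

Because this probability tends to zero as k grows, no packet is expected to circulate forever in this model, and no dedicated livelock-avoidance logic is needed.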