Break-out Topics and Talks | Paul G. Allen School of Computer Science & Engineering

Thursday, February 26, 2004

SESSION I 9:35 - 10:50	Networking Tom Anderson CSE 305	Programming Languages, Compilers and Software Engineering Dan Grossman CSE 403	Artificial Intelligence Henry Kautz CSE 691
SESSION II 11:00 - 12:15	Systems Hank Levy CSE 305	Data Management Alon Halelvy CSE 403	HCI Richard Anderson CSE 691
SESSION III 3:00 - 4:15	Architecture and Hardware Mark Oskin CSE 305	Theory Anna Karlin CSE 403	Graphics and Vision Steve Seitz CSE 691
SESSION IV 4:30 - 5:45	Computational Biology Martin Tompa CSE 305	Compression Richard Ladner CSE 403	Ubiquitous Computing Gaetano Borriello CSE 691

Session I

Networking: Tom Anderson

Upgrading Transport Protocols Using Mobile Code (Powerpoint Slides)
Andrew Whitaker

"In this talk, I'll present STP, a system in which communicating end hosts can use untrusted mobile code to remotely upgrade each other's network transport protocol. New transport protocols are written in a type-safe version of C, distributed out-of-band, and run in-kernel. Communicating peers select a transport protocol to use as part of a TCP-like connection setup handshake that is backwards-compatible with TCP and incurs minimum connection setup latency. New transports can be invoked by unmodified applications. By providing a late binding of protocols to hosts, STP removes many of the delays and constraints due to backwards-compatibility that are otherwise commonplace when upgrading Internet transport protocols."
Deterring Selfish Behavior in Multi-Hop Wireless Networks
Maya Rodrig

"While the connectivity of wireless networks can be enhanced by multi-hop routing, the packet forwarding required uses resources some nodes may be unwilling to provide. Traditional wireless routing protocols merely assume that these resources will be given, and so are vulnerable to selfish nodes that consume network resources but refuse to contribute to them. We present Catch, a protocol that addresses this vulnerability by employing a novel technique based on anonymous messages to detect and punish selfish behavior. We evaluate Catch using both simulation and an 802.11 wireless testbed, and find that it effectively deters selfish behavior."
Routing in a Delay Tolerant Network (Powerpoint Slides)
Sushant Jain

"In this talk, we formulate and motivate the delay-tolerant networking routing problem, where messages are to be moved end-to-end across a connectivity graph that is time varying but whose dynamics may be known in advance. Such networks have the general property that no end-to-end path may ever exist which limits the applicability of traditional routing approaches. In practical terms, DTNs arise in networks with periodic connectivity e.g. Low-Earth Orbiting Satellites (LEO), sensor networks with disconnection arising because of scheduled power outage or those with unpredicted, opportunistic connectivity e.g. communication between mobile PDAs. We present a framework for evaluating different routing algorithms and show a simulation based comparison of several of our own algorithms in this framework."
Interdomain Routing with Negotiation
Ratul Mahajan

"Current interdomain routing policies are largely based on information local to each ISP, in part due to competitive concerns and the lack of effective information sharing mechanisms. This can lead to routes that are sub-optimal and even to routes that oscillate. We explore a setting in which two neighboring ISPs negotiate to determine the path of the traffic they exchange. We first ask the basic question: is there an incentive to negotiate? The incentive exists only if both ISPs benefit relative to routing based on local information. Through simulation with over sixty measured ISP topologies, we find that negotiation is useful for both latency reduction and hotspot avoidance. Interestingly, we find that global optimization is undesirable in the sense that one ISP often suffers to benefit the other. Based on our results, we design and evaluate a negotiation protocol which works within the real-world constraints of competing and independently managed ISPs. Specifically, our protocol reveals little information and works even when ISPs have different optimization criteria. We find that it achieves routing performance comparable to that of global optimization using complete information from both ISPs."

Programming Languages, Compilers and Software Engineering: Dan Grossman

Design Decision Support via Lightweight Analysis of Code
Vibha Sazawal

"A common problem that arises in software engineering is the failure to meet "design criteria" of interest, even though the intention to meet such criteria existed at the onset of the project. Traditional software engineering design criteria include ease of understanding, ease of change, and robustness. Other criteria popular today include reusability, simplicity, and testability. In this talk, I will propose ways that tools can help software engineers meet design criteria. I will also introduce a tool I have built, which helps software engineers with one criterion: ease of change. The tool, designed as a plug-in to the Eclipse Java IDE, automatically generates lightweight visualizations that help a programmer diagnose problems such as over-coupling and lack of information hiding. The intent is to encourage iteration and guide extensions to existing software by helping programmers understand the design details latent in their code."
Ethnographic Study on Copy and Paste Programming Practice
Miryung Kim

"A common way of developing and evolving software is by cutting and pasting code from an existing code base, or sources such as web pages or documentation. We conducted an ethnographic study in order to verify our hypothesis that programmers follow a number of common copy and paste usage patterns. Based our observations, we built a taxonomy of copy and paste usage patterns. In this talk, we will present a taxonomy of copy and paste usage patterns and discuss our initial findings with examples drawn from our observations. We will propose a set of tools that limit software maintenance problems caused by copy and paste, as well as better-supporting the intent of many copy and paste programming practices."
Partial Class Analysis
Andrei Alexandrescu

"Class hierarchy analysis improves the quality of of static class information by examining the complete inheritance graph of a program. However, both separate compilation and dynamic loading of classes or libraries make it impossible to apply class hierarchy analysis because the compiler does not have access to the entire inheritance graph, but rather only to a fragment of it. We describe partial class hierarchy analysis, a generalization of class hierarchy analysis that can work even in the face of partially known class hierarchies. Partial class hierarchy analysis can use abstract constraints on the unseen part of the class hierarchy in order to perform optimizations without ever requiring complete knowledge of the program's entire class hierarchy. Partial class hierarchy analysis is particularly suitable for multi-lingual compilation systems."
Cyclone: Safe Programming at the C Level of Abstraction (Powerpoint Slides)
Dan Grossman

"Memory safety and type safety are invaluable features for constructing robust software. However, most safe languages are at a high level of abstraction; programmers cede almost all control over data representation and memory management. This control is one reason C remains the de facto standard for writing systems software or extending legacy systems already written in C. This talk will give the flavor of my work on designing a language (called Cyclone) combining safety C's user-control over data-representation and resource management. I will also sketch future research directions for language design and implementation for low-level applications."

Artificial Intelligence: Henry Kautz

Using Dynamic Bayesian Networks and RFID Tags to Infer Human Behavior (Powerpoint Slides)
Don Patterson

"We present a probabilistic behavior recognition model that can be used to segment continuous data and distinguish between dozens of different activities. The model uses a natural semantic specification of partially-ordered activities. In order to provide robust and general activity descriptions certain parameters of the underlying probabilistic model are deliberately left unspecified. However, we show how inference on a dynamic Bayesian network of the model can be efficiently supported by suspending complete state estimation until queried. We present results on recognizing the performance of activities of daily living (such as cooking and cleaning) in a real home, where data is gathered using wearable RFID tag readers to detect the physical manipulation of household objects."
Learning and Inferring Transportation Routines (Powerpoint Slides)
Lin Liao

"We introduce a hierarchical model that can learn and infer a user's daily movements through the community. The model uses multiple levels of abstraction in order to bridge the gap between raw GPS sensor measurements and high level information such as a user's mode of transportation or her goal. We apply Rao-Blackwellised particle filters for efficient inference both at the low level of the model and at the higher levels of the hierarchy. Significant locations such as goals or locations where the user frequently changes mode of transportation are learned from GPS data logs without requiring any manual labeling. We show how to detect user errors by concurrently tracking his activities with a trained and an untrained model. Extensive experiments show that our model is able to accurately predict the goals of a person and to recognize situations in which the user performs unknown activities."
Dynamic Probabilistic Relational Models
Sumit Sanghai

"Stochastic processes that involve the creation and modification of objects and relations over time are widespread, but relatively poorly studied. For example, accurate fault diagnosis in factory assembly processes requires inferring the probabilities of erroneous assembly operations, but doing this efficiently and accurately is difficult. Modeled as dynamic Bayesian networks, these processes have discrete variables with very large domains and extremely high dimensionality. Recently, Sanghai et al. (2003) introduced dynamic probabilistic relational models (DPRMs) to model probabilistic relational domains that change with time. They also proposed a form of Rao-Blackwellized particle filtering which performs well on problems of this type, but relies on very restrictive assumptions. In this paper we lift these assumptions, developing two forms of particle filtering that are in principle applicable to any relational stochastic process. The first one uses an abstraction lattice over relational variables to smooth the particle filter's estimates. The second employs kernel density estimation using a kernel function specifically designed for relational domains. Experiments show these two methods greatly outperforms standard particle filtering on the task of assembly plan execution monitoring."

Session II

Systems: Hank Levy

Constructing Services with Interposable Virtual Hardware (PDF Slides)
Rick Cox

"Virtual machine monitors (VMMs) provide useful support for many novel applications beyond their traditional role of isolating virtual machines (VMs) from one another. These virtual machine services, including migration, intrusion detection, and trusted computing, can operate at a level below the operating system, where fewer abstractions interfere with their purpose. We will discuss a new platform, the virtual device monitor (VDM), that enables virtual machine services to be constructed easily. VDMs expose virtual devices and a virtual bus interconnecting them, enabling virtual machine services services to be implemented without modifying the VDM."
Measurement, Modeling and Analysis of a Peer-to-Peer File-Sharing Workload
Krishna Gummadi

"Peer-to-peer (P2P) file sharing accounts for an astonishing volume of current Internet traffic. This work probes deeply into modern P2P file sharing systems and the forces that drive them. By doing so, we seek to increase our understanding of P2P file sharing workloads and their implications for future multimedia workloads. Our research uses a three-tiered approach. First, we analyze a 200-day trace of over 20 terabytes of Kazaa P2P traffic collected at the University of Washington. Second, we develop a model of multimedia workloads that lets us isolate, vary, and explore the impact of key system parameters. Our model, which we parameterize with statistics from our trace, lets us confirm various hypotheses about file-sharing behavior observed in the trace. Third, we explore the potential impact of locality-awareness in Kazaa.

Our results reveal dramatic differences between P2P file sharing and Web traffic. For example, we show how the immutability of Kazaa's multimedia objects leads clients to fetch objects at most once; in contrast, a World-Wide Web client may fetch a popular page (e.g., CNN or Google) thousands of times. Moreover, we demonstrate that: (1) this ``fetch-at-most-once'' behavior causes the Kazaa popularity distribution to deviate substantially from Zipf curves we see for the Web, and (2) this deviation has significant implications for the performance of multimedia file-sharing systems. Unlike the Web, whose workload is driven by document change, we demonstrate that clients' fetch-at-most-once behavior, the creation of new objects, and the addition of new clients to the system are the primary forces that drive multimedia workloads such as Kazaa. We also show that there is substantial untapped locality in the Kazaa workload. Finally, we quantify the potential bandwidth savings that locality-aware P2P file-sharing architectures would achieve."
Hippocrates: A System for Coping with Network Intrusions
Marianne Shaw

"We present the design and implementation of Hippocrates, a system that contains the damage caused by successful compromise of an application. Hippocrates takes the stance that, for the foreseeable future, applications will likely contain vulnerabilities. As a result, we limit the scope of what an attacker can affect within a machine after exploiting a bug in an application, and we impede the attacker from using the compromised machine as a stepping stone for mounting further attacks. To accomplish this, we use a novel combination of existing technologies: we isolate each application within its own virtual machine, and we use application-granularity firewalls to restrict the kind of traffic a compromised VM is able to generate.

Our system supports running legacy applications and operating systems. After presenting the design of Hippocrates, we discuss our experiences in running three NetBSD applications within our system: the Apache web server, the lynx web browser, and the ``mutella'' peer-to-peer file sharing application. We discuss the firewall policies that we derived to contain these applications, and demonstrate how using application-granularity firewalling leads to much more effective and simple firewall policies than host- or enterprise-granularity firewalls."
Storage-Oriented Data Delivery (PDF Slides)
Paul Gauthier

"We introduce the model of Storage-Oriented Data Delivery (SODD) in order to overcome the bandwidth and latency limitations that arise when delivering digital content using today's point-to-point networks. The strategy behind SODD is twofold. First, SODD exploits massive disks (terabytes to petabytes) at each node to cache data that is likely to be accessed at that node. Second, SODD aggressively delivers data to many client nodes simultaneously using a physical broadcast network."

Data Management: Alon Halevy

Semex
Stefan Bjarni Sigurdsson

"Semex is a platform for building advanced personal information management (PIM) services. It provides a meaningful logical data organization by complementing the application-centric organization of personal data with an entity-centric organization. The latter enables data access by browsing, searching or querying for rich patterns of conceptual associations between entities. The database of entities and links between them is populated automatically by analyzing files in the user's information space. Semex is a step towards Vannevar Bush's memex vision."
Woogle: Intelligent Web Service Search Engine (Powerpoint Slides)
Xin Dong

"Web services are loosely coupled software components, published, located, and invoked across the Web, using standard XML-based protocols. Web services have recently received significant attention because they offer dynamic collaboration with the vision of large-scaled information sharing and ubiquitous computing. To fully exploit the advantages of web services, it is critical to provide web service search that facilitates developers to effectively locate the web services that meet their needs. Compared with traditional webpage search, web service search faces two challenges: 1) Currently there are only thousands of active web services on the web; so recall is more important than precision as an evaluation critique. 2) A web service has a more complex structure than pure text: a web service is composed of several operations, each of which takes a list of input parameters, fulfills a certain task, and returns the result by output parameters. In our work, we improve web service search by exploring the structure and semantics of web services. We provide keyword search on web service functionality and input/output. For each given web service operation, we provide a list of related web service operations. We'll also exploit the possibility of composing web services in the future. Our web service search engine is available at http://barb.cs.washington.edu:8080/won/wonServlet. "
Answering Uncertain Queries in Databases
Nilesh Dalvi

"Unlike Information Retrieval, databases are known for supporting queries that have a precise semantics. This makes it possible for users to formulate complex queries and for systems to apply complex optimizations, but it makes it hard for a casual user to formulate queries without having a precise knowledge of the database. A single misspelling of a constant in the WHERE clause of a SQL query leads to an empty set of answers, frustrating casual users. IR queries offer two important features that are missing in databases: the results are ranked and the matches may be uncertain, i.e. the answer may include documents that do not match all the keywords in the query. Attempts have been made in the past to incorporate ranked results and uncertain matches into database queries, however they have failed to provide a precise semantics that captures both features.

In this paper we describe a new semantics for database queries, based on the probabilistic relational algebra and models for belief. Several notions of "uncertain matches" can be supported, including presence/absence of a keyword, edit distances, ontology-based distances, IDF-similarity and semantic closeness based on a dictionary."
Information Disclosure in Data Exchange
Dan Suciu

"Suppose we want the benefits office to access to the payroll database, but not the salary field. Then we simply define a view that omits salaries. This technique has been tried and tested in database applications, and is now also used when data is exchanged between different independent organization: for each parter, we define a set of view(s) to be shared with that partner, hoping that the secret information in our database is not disclosed to anyone. But is it ? What happens if some of our partners "collude", i.e. share the views they got from us and try to infer more information that was not available in each view separately? We may have thousands of different views exchanged with hundreds of different partners, and it is unclear what it even means for the secret information to be disclosed. In this talk I will discuss some fundamental research done by Gerome Miklau and myself on studying information disclosure in database queries and views. We followed Shannon's information-theoretic approach to define when it means for a secret query S to be "perfectly secure" w.r.t. a set of views V1, ..., Vn, then discuss what it involves to check perfect security automatically. The same query-view security notion can be extended to the case of prior knowledge, such as schema constraints or previously published views. In the talk I will avoid dry theory, and instead illustrate the concepts with (hopefully fun) examples. Work done with Gerome Miklau."

HCI: Richard Anderson

The Use of Digital Ink in Lecture Presentation (Powerpoint Slides)
Crystal Hoyer

"In this talk we present our findings on how instructors use digital ink when delivering lectures from electronic slides. We studied the use of Classroom Presenter, a Tablet PC based lecture presentation system that we developed in a series of univeristy level computer science courses. We found that much of the ink usage was ephemeral, as opposed to archival, and we also identified a style of ink usage that has a similarity to physical hand gestures."
Tactilization of Graphical Images and Image Classification (Powerpoint Slides)
Sahnguyn Hahn

"In recent years, the number of graphs, diagrams, and illustrations in math, science and engineering (MSE) textbooks and other publications at all levels has been growing. These visualizations help students understand better difficult concepts. However, this trend has created a serious barrier to success of the blind students in the MSE field because of the difficulties to convert these visual representations to an accessible form for the blind. Labor-intensive processes, slow reproduction speeds, often requiring experts makes it hard to produce accessible graphical images of high quality in a timely manner. The goal of our project is to alleviate these difficulties by developing software that translate graphical images to tactile format as automatically as possible, and by providing a guideline to produce tactile forms of graphical images of high quality. I will briefly introduce the components our project: 1) identifying the properties of high quality tactile graphics, 2) developing a software that actually produces tactile graphics with such properties, and 3) educating people and disseminating our findings. I will then focus on my current research, the problem of classification of images into types, which will determine which image processing is to be applied to achieve good tactile results. This is joint work with Richard Ladner, Melody Ivory-Ndiaye, and Raj Rao."
Using Design Patterns to Create Cross-Platform Web Sites (Powerpoint Slides)
James Landay

"We are now entering the era of ubiquitous computing, an era where people will access information and services anywhere, anytime, and from a wide variety of devices. One challenge for practitioners is to design user interfaces that work across different devices. Design patterns offer a solution to this problem. Design patterns encapsulate successful design solutions for later reuse. Patterns make it easy for designers to take advantage of proven solutions to common design problems while also giving designers room to creatively solve problems unique to their design. Patterns have a long history in architecture and software engineering, but have only recently been used in interface design. In this talk we will introduce the design pattern approach to Web site design and give you a sampling of the patterns found in our successful book The Design of Sites. We will also demonstrate our ideas for a sketch-based tool that uses design patterns as a way to solve many of the problems with cross device design."

Session III

Architecture and Hardware: Mark Oskin

Visual Pattern Recognition in Analog VLSI
Seth Bridges

"Biometric applications and the miniaturization of electrical and physical systems for embedded and mobile systems impose severe constraints on power, performance, adaptivity, and interfacing. All of these constraints can be met simultaneously by an efficient hardware neural network. One particularly VLSI-amenable network architecture is the Radial Basis Function Neural Network (RBFNN). Implementations of RBF networks in current digital technologies cannot satisfy the low-power and analog interfacing requirements of such highly constrained systems. Prior analog VLSI implementations suffered from low accuracy and cumbersome weight updates. Current analog techniques including the use of synapse transistors allow new approaches to building hardware learning chips. In this talk, I will discuss the application of RBFNN to visual pattern recognition as well as VLSI implementations of RBFNN including our flexible, low-power, and reconfigurable implementation of a synapse transistor based learning system that implements a variety of learning networks."
WaveScalar and the WaveCache
Steve Swanson

"Silicon technology will continue to provide an exponential increase in the availability of raw transistors. Effectively translating this resource into application performance, however, is an open challenge. Ever increasing wire-delay relative to switching speed and the exponential cost of circuit complexity make simply scaling up existing processor designs futile. In this paper, we present an alternative to superscalar design, WaveScalar. WaveScalar is a dataflow instruction set architecture and execution model designed for scalable, low-complexity/high-performance processors. WaveScalar is unique among dataflow architectures in efficiently providing traditional memory semantics. At last, a dataflow machine can run "real-world" programs, written in any language, without sacrificing parallelism.

The WaveScalar ISA is designed to run on an intelligent memory system. Each instruction in a WaveScalar binary executes in place in the memory system and explicitly communicates with its dependents in dataflow fashion. WaveScalar architectures cache instructions and the values they operate on in a WaveCache, a simple grid of "alu-in-cache" nodes. By co-locating computation and data in physical space, the WaveCache minimizes long wire, high-latency communication. This paper introduces the WaveScalar instruction set and evaluates a simulated implementation based on current technology. Results for the SPEC and Mediabench applications demonstrate that the WaveCache out-performs an aggressively configured superscalar design by 2-7 times, with ample opportunities for future optimizations."
An Algebraic Foundation for Quantum Programming Languages
Andrew Petersen

"Quantum computation was first proposed two decades ago. Since then, physicists have made great strides toward developing quantum hardware, but we have given little thought to determining how we might program a quantum computer. We propose using an algebraic formalism as a foundation for developing quantum programming languages and modeling the behavior of quantum computing hardware. Our algebra supports abstractions for reasoning about quantum effects and indicates important quantum properties explicitly rather than focusing on describing a physical system, which makes the algebra independent of the implementation of the system."

Theory: Anna Karlin

Maximizing the Spread of Influence in a Social Network (PDF Slides)
David Kempe

"A social network - the graph of relationships and interactions within a group of individuals - plays a fundamental role as a medium for the spread of information, ideas, and influence among its members. An idea or innovation will appear - for example, the use of cell phones among college students, the adoption of a new drug within the medical profession, or the rise of a political movement in an unstable society - and it can either die out quickly or make significant inroads into the population.

The resulting collective behavior of individuals in a social network has a long history of study in sociology. Recently, motivated by applications to word-of-mouth marketing, Domingos and Richardson proposed the following optimization problem: allocate a given "advertising" budget so as to maximize the (expected) number of individuals who will have adopted a given product or behavior.

In this talk, we will investigate this question under the mathematical models of influence studied by sociologists. We present and analyze a simple approximation algorithm, and show that it guarantees to reach at least a 1-1/e (roughly 63%) fraction of what the optimal solution can achieve, under many quite general models. In addition, we experimentally validate our algorithm, comparing it to several widely used heuristics on a data set consisting of collaborations among scientists. Joint work with Jon Kleinberg and Eva Tardos."
Scheduling Techniques for Broadcasting Popular Media (Powerpoint Slides)
Tami Tamir

"Broadcasting is an efficient alternative to unicast delivery of popular on-demand requests. The talk presents new broadcasting schemes for highly loaded on-demand systems. These schemes are based on a new abstract model of the system. They exploit new technologies, which already exist or are expected to be available in the near future, such as the multicast capabilities of the Internet, high receive bandwidth at clients, and fast, cheap, storage at clients. Given these new technologies, the challenge is to find the best way to schedule the data on the multicast channels to provide the best quality of service. For a given bandwidth, our schemes guarantee the minimum possible start-up delay time for an uninterrupted playback."
Algorithms for Dynamic Multicast Key Distribution Trees (Powerpoint Slides)
Justin Goshi, Richard Ladner

"We study the problem of multicast key distribution for group security. Secure group communication systems typically rely on a group key, which is a secret shared among the members of the group. This key is used to provide privacy by encrypting all group communications. Because groups can be large and highly dynamic, it becomes necessary to change the group key in a scalable and secure fashion when members join and leave the group. We present a series of algorithms for solving this problem based on key trees. The algorithms attempt to minimize the worst-case communication cost of updating the group and auxiliary keys by maintaining balanced key tree structures. We focus on the trade-off between the communication cost due to the structure of the tree and that due to the overhead of restructuring the tree to maintain its balanced structure. The algorithms are analyzed for worst-case tree structure bounds and evaluated empirically via simulations."

Graphics and Vision: Steve Seitz

Aseem Agrawala

"We describe an interactive, computer-assisted framework for combining parts of a variety of photographs into a single composite picture, a process we call digital photomontage."
Yi Li

"This talk addresses the challenging task of recognizing common objects in color photographic images. We represent images as sets of feature vectors of multiple types of abstract regions, which come from various segmentation processes. We model each abstract region as a mixture of Gaussian distributions over its feature space. We have developed a new semi-supervised version of the EM algorithm for learning the distributions of the object classes."
Adrien Treuille

"Physics-based simulations are pervasive in computer graphics because of their compelling nuance and realism. We describe the first method for controlling physics-based animations of smoke and water."
Dan Goldman
"We describe a technique for computing the 3D shape and surface materials from a set of photographs, and demonstrate applications for relighting, editing, and transferring materials from one object to another."
Keith Grochow

"This research presents a realtime inverse kinematics system based on a learned model of human poses. This system can be used anywhere realtime IK is used to generate realistic poses from a set of constraints including interactive character posing, keyframed animation, real-time motion capture with noisy data, and 3D motion from 2D images."
Jiwon Kim

"We present an approach for reconstructing the physical state of documents and other objects on a desk over time using an overhead video camera. Two prototype applications are demonstrated that enable a range of interesting queries for the structure and history of the desktop."
Li Zhang

"In this talk, we first describe a projector-camera system to capture 3D shapes of moving faces at high spatial and temporal resolution. Then, we demonstrate how to use the captured shapes to realistically edit and animate facial expressions."

Session IV

<3>Computational Biology: Martin Tompa

A Gentle Introduction to Computational Biology (Powerpoint Slides)
Martin Tompa

"I will give a very short introductory presentation of "Everything you need to know about molecular biology", as background to the remaining talks in this session."
Motif Discovery in Heterogeneous Sequence Data (Powerpoint Slides)
Martin Tompa, Amol Prakash

"We have introduced the first integrated algorithm designed to discover novel motifs in "heterogeneous sequence data", which is composed of genes from a single organism that are believed to share a motif, together with the corresponding genes from related organisms. Results will be presented for such gene sets in yeasts, worms, and mammals. This is joint work with Mathieu Blanchette, Saurabh Sinha, and Martin Tompa."
Faster Genome Annotation of Non-Coding RNA Families Without Loss of Accuracy
Zasha Weinberg

"Non-coding RNAs (ncRNAs) are functional RNA molecules that, in contrast to the usual case in genetics, do not code for proteins. Covariance Models (CMs) are a useful statistical tool based on context-free grammars to search a genome database for novel ncRNAs that resemble a given family of evolutionarily related ncRNAs. CMs can be highly accurate, but unfortunately CM searches are very slow. We show how to make CMs faster while provably sacrificing none of their accuracy. Specifically, based on the CM, our software builds a profile hidden Markov model (HMM), which filters the genome database. This HMM is a "rigorous filter", i.e., its filtering eliminates only sequences that provably could not be annotated as homologs. The CM is run only on what remains. For most known ncRNA families, this allows a genome database of 8 billion DNA nucleotides to be scanned in 2-20 days instead of years, and yields new ncRNAs missed by other techniques to improve CM speed. Joint work with Walter L. Ruzzo."
Gene Function Prediction of Escherichia coli Based on Heterogeneous Data Sources (Powerpoint Slides)
Zizhen Yao

"As a variety of functional genomic and proteomic techniques become available, there is increasing need for functional analysis methodology that integrates heterogeneous data sources. In this paper, we address this issue by proposing a general framework based on the k-nearest-neighbor algorithm. Unlike traditional K-nearest-neighbor algorithms, we use regression methods to infer the similarity metric, which captures the structure of feature space such as differential relevance of features and their correlation. We also suggest a novel voting scheme to generate confidence scores, which estimates the accuracy of predictions. We applied this technique to gene functional prediction in Escherichia coli, using information derived from microarray expression and sequencing data. Compared to three well-known Escherichia coli classification schemes suggested by biologists, our algorithms show satisfactory performance. We demonstrate that our algorithm outperforms SVM algorithm based on microarray data only, and the naive KNN methods based on the microarray data and sequence data. We also show that by combining different data sources, prediction accuracy can improve significantly. "

Compression: Richard Ladner

Reduced Complexity Wavelet-Based Predictive Coding of Hyperspectral Images for FPGA Implementation
Agnieszka Miguel

"We present an algorithm for lossy compression of hyperspectral images for implementation on field programmable gate arrays (FPGA). To greatly reduce the bit rate required to code images, we use linear prediction between the bands to exploit the large amount of inter-band correlation. The prediction residual is compressed using the Set Partitioning in Hierarchical Trees algorithm. To reduce the complexity of the predictive encoder, we propose a bit plane-synchronized closed loop predictor that does not require full decompression of a previous band at the encoder. The new technique achieves almost the same compression ratio as standard closed loop predictive coding and has a simpler on-board implementation. Joint work with Amanda R. Askew, Alexander Chang, Scott Hauck, Richard E. Ladner, Eve A. Riskin."
An Algorithm For Constant-Quality Compressed Video (PDF Slides)
Michael Ringenburg

"Compressed video streams typically suffer from significant quality degradation in the frames immediately following a scene change. This phenomenon can even be observed by setting your TiVo to its lowest quality (highest compression) setting and watching for the blockiness at scene changes. This talk will start by giving a quick introduction to standard video compression techniques, and explaining why this post-scene change quality degradation occurs. We will then describe an algorithm we developed to achieve constant quality video, even after scene changes, using embedded video coders like the University of Washington's GTV coder (Group Testing for Video). We will conclude by showing comparisons to standard MPEG encoded video. Joint work with Richard Ladner and Eve Riskin."
Modifying MELPe Speech Compression with Bit-plane Entropy Coding (PDF Slides)
Chris Parrish

"A method is presented to modify the standard MELPe 2400 bps speech coder in which the prediction coefficients are grouped and a Discrete Cosine transform is applied for favorable energy compaction. The resulting coefficients are quantized by bit-planes and entropy coding is applied to generate the resulting compressed bit-stream. This technique allows for several advantages over the normal vector quantization used in the MELPe standard. Joint work with Richard Ladner and Eve Riskin."

Ubiquitous Computing: Gaetano Borriello

Microbiology Tray and Pipette Tracking as a Proactive Tangible User Interface (Powerpoint Slides)
Harlan Hile

"Many work environments can bene t from integrated computing devices to provide information to users, record users' actions, and prompt users about the next steps to take in a procedure. We focus on the cell biology laboratory, where previous work on the Labscape project has provided a framework to organize experiment plans and store data. Currently developed sensor systems allow amount and type of materials used in experiments to be recorded. This paper focuses on providing the last piece: determining where the materials are deposited. Using a camera and projector setup over a lab bench, vision techniques allow a specially marked well tray and pipette to be located in real time with enough precision to determine which well the pipette tip is over. Using the projector, the tray can be augmented with relevant information, such as the next operation to be performed, or the contents of the tray. Without changing the biologist's work practice, it is possible to record the physical interactions and provide easily available status and advice to the user. Preliminary user feedback suggests this system would indeed be a useful addition to the laboratory environment."
SUPPLE: Automatically Generating User Interfaces (PDF Slides)
Krzysztof Gajos

"In order to give people ubiquitous access to software applications, device controllers, and Internet services, it will be necessary to automatically adapt user interfaces to the computational devices at hand (e.g., cell phones, PDAs, touch panels, etc.). While previous researchers have proposed solutions to this problem, each has limitations. This paper proposes a novel solution based on treating interface adaptation as an optimization problem. When asked to render an interface on a specific device, our Supple system searches for the rendition that meets the device’s constraints and minimizes the estimated effort for the user’s expected interface actions. We make several contributions:
1) precisely defining the interface rendition problem,
2) demonstrating how user traces can be used to customize interface rendering to particular user’s usage pattern,
3) presenting an efficient interface rendering algorithm,
4) performing experiments that demonstrate the utility of our approach."
The Place Lab Location Enhanced Computing Project: A UW and Intel Research Collaboration (Powerpoint Slides)
Jeff Hightower

"Researchers from UW, Intel, UCSD, and elsewhere have joined to bootstrap the broad adoption of location-enhanced web services. We believe location-aware computing must be as effortless, familiar, and rewarding as search tools like Google. Our collaboration is producing open technology that enables private, course grain, indoor and outdoor positioning on cellular mobile computers (i.e., WiFi notebooks, WiFi PDAs, and programmable cell phones) with no additional hardware. Through individual Place Labs initially seeded on the campuses of universities, colleges, and research organizations this initiative will be a vehicle for research, instruction, collaboration and application sharing."