Thursday, November 4, 2004

SESSION I (9:40 - 10:30)
    A Microcosm for Understanding RFID (CSE 305)
    Our Role in the Biology Revolution (CSE 403)
    Bringing Your Pictures to Life (CSE 691)
SESSION II (10:40 - 11:30)
    Making Life Easier for Disabled People (CSE 305)
    Search and Discovery (CSE 403)
    Making Systems more Reliable and Secure (CSE 691)
SESSION III (11:40 - 12:30)
    Location! Location! Location! (CSE 305)
    Internet Performance and Security (CSE 403)
    Smart User Interfaces (CSE 691)

Session I

    A Microcosm for Understanding RFID (CSE 305)

    • 9:40-9:45: Introduction and Overview: Gaetano Borriello (Powerpoint Slides)

    • 9:45-10:00: An RFID Ecosystem, Gaetano Borriello and David Kaplan (Powerpoint Slides)

      "Information technology has come far in recent years, bringing the cost of computation, storage, and communication components so low as to make these resources virtually unlimited, and enabling applications that seemed impossible a few years ago. However, despite these advances, a large gap remains between the physical world and digital information systems. Radio-frequency identification (RFID) systems associate radio-enabled tags with physical objects. The tags are read by nearby readers, from which typical passive tags also harvest their power. Tags contain a unique identifier, and can also provide information from onboard memory or from connected sensors (such as temperature or pressure sensors). RFIDs promise to make almost any physical object identifiable in the digital world at extremely low cost. We seek to create an RFID Ecosystem - a large-scale, realistic test-bed for the pervasive deployment of RFID technology, systems infrastructure, and ubiquitous computing applications. We want to deploy a sufficient number of readers in our building to track objects throughout our space, enable occupants to use RFID tags embedded in the environment, and provide a range of applications that will help us explore the technology and social implications of ubiquitous RFID systems."

    • 10:00-10:15: Reminding about Tagged Objects using Passive RFIDs, Cameron Tangney and Matthew Hall (Powerpoint Slides)

      "People often misplace objects they care about. We present a system that generates reminders about objects left behind by tagging those objects with passive RFID tags. Readers positioned in the environment frequented by users read tags and broadcast the tags' IDs over a short-range wireless medium. A user's personal server collects the read events in real-time and processes them to determine if a reminder is warranted or not. The reminders are delivered through the personal server's wristwatch UI through a combination of text and iconic messages and audible beeps. We believe this leads to a practical and scalable approach in terms of system architecture and user experience as well as being more amenable to maintaining user privacy than previous approaches. We present results that demonstrate that current RFID tag technology is appropriate for this application."

    • 10:15-10:30: A Handheld Wireless RFID Reader and its Applications: The SF Exploratorium eXspot Project, Waylon Brunette (Powerpoint Slides)

      "Together with SkyeTek and Crossbow, we have developed a small handheld near-field RFID reader. We have applied it in several interesting applications. We'll describe a set of applications developed by Industrial Design and Computer Engineering students in a capstone design class ranging from children's games to aids for the blind, and a project we have underway with the San Francisco Exploratorium to enhance museum experiences."

    Our Role in the Biology Revolution (CSE 403)

    • 9:40-9:45: Introduction and Overview: Martin Tompa (Powerpoint Slides)

    • 9:45-10:00: Potholes on the Way Towards Discovery of Functional DNA Elements Through Comparative Sequence Analysis, Amol Prakash (Powerpoint Slides)

      "I will present the problem of taking the whole genomes of all the vertebrates whose DNA has been sequenced (human, chimp, mouse, rat, chicken, 2 fishes, and soon dog and cow) and using comparative sequence analysis to find those elements of the DNA responsible for regulating gene expression. It's a difficult problem and full of subtle pitfalls (most due to biologists' incomplete understanding of these genomes). We have discovered many of these pitfalls and have devised some clever solutions. As a result we have produced some very high-quality candidates for DNA regulatory elements. This is work done with my advisor Martin Tompa. All are welcome to attend this talk. No assumption will be made regarding the technical background of the audience."

    • 10:00-10:15: RNA Informatics (Europe's Biggest Cycle-User?): What We're Doing About It / What We're Learning, Walter L. Ruzzo (PDF Slides)

      "One of the biggest users of scientific computing cycles in Europe is a bioinformatics application -- genome-wide searches for "non-coding RNAs" (ncRNAs) routinely monopolize 1000 computers for a month. ncRNAs are functional RNA molecules that, in contrast to the usual case in biology, do not code for proteins. A rush of discoveries in the last few years has greatly expanded their number, variety, and significance. Statistical models based on probabilistic context-free grammars are the leading approach to describing ncRNA families and searching for new members. They are limited, however, in that (a) constructing the models is somewhat laborious, and especially (b) searches are very slow -- years of CPU time. We have developed novel algorithms addressing both issues, and they have already led us to discovery of unusual new families of ncRNAs, including the novel cooperative riboswitch recently published in "Science." This is joint work, of course, with my students Zasha Weinberg and Zizhen Yao."

    • 10:15-10:30: Knowledge Sharing in Molecular Biology, Peter Mork (Powerpoint Slides)

      "Biological research demands more agile and distributed data management solutions than are offered by off-the-shelf data warehouse systems. The first challenge is the need for recent data; this can be addressed using data integration technology. Moreover, given the rate at which sources evolve, the second challenge is that the system needs to be easily reconfigurable. My research addresses this challenge by extending peer data management techniques into the realm of biomedicine. I am developing mapping rules and algorithms for query reformulation suited to this domain."

    Bringing Your Pictures to Life (CSE 691)

    • 9:40-9:45: Introduction and Overview: Zoran Popović

    • 9:45-10:00: Animated Images, Dan Goldman (Powerpoint Slides)

      "We explore the problem of taking a still picture and making it move in convincing ways. Consider the domain of scenes containing passive elements that respond to natural forces in some oscillatory fashion. We present a semi-automatic approach, in which a human user segments the scene into a series of layers to be individually animated. The automatic part of the approach works by synthesizing a "stochastic motion texture" using a spectral method -- i.e., a filtered noise spectrum whose inverse Fourier transform is the motion texture. The motion texture is a time-varying 2D displacement map, which is applied to each layer. The resulting warped layers are recomposited, along with "in-painting" to fill any holes, to form the animated frames. The result is a video texture created from a single still image, which has the advantages of being more controllable and of generally higher image quality and resolution than a video texture created from a video source. We demonstrate the technique on a variety of photographs and paintings."

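      The spectral synthesis step in the Animated Images abstract can be illustrated roughly as follows. This is only a minimal sketch, not the authors' implementation: it produces a 1D displacement per frame for a single layer rather than a 2D displacement map, and the Gaussian band-pass filter with its `peak_bin` and `bandwidth` parameters is an assumption made for illustration.

```python
import cmath
import math
import random


def noise_spectrum(n_frames, peak_bin=3.0, bandwidth=1.5, seed=0):
    """Gaussian noise shaped by a band-pass filter (illustrative parameters)."""
    rng = random.Random(seed)
    spectrum = [0j] * n_frames  # bin 0 stays zero, so the mean displacement is zero
    for k in range(1, (n_frames - 1) // 2 + 1):
        gain = math.exp(-0.5 * ((k - peak_bin) / bandwidth) ** 2)
        value = gain * complex(rng.gauss(0, 1), rng.gauss(0, 1))
        spectrum[k] = value
        spectrum[n_frames - k] = value.conjugate()  # symmetry makes the time signal real
    return spectrum


def inverse_dft(spectrum):
    """Direct O(n^2) inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def motion_texture(n_frames, **filter_params):
    """Displacement of one layer at each frame; the sequence is periodic in n_frames."""
    samples = inverse_dft(noise_spectrum(n_frames, **filter_params))
    return [value.real for value in samples]


displacements = motion_texture(64)
```

      Because the signal is built from an inverse DFT, it repeats with period `n_frames`, which is one reason a spectral method suits looping video textures.
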
    • 10:00-10:15: Flow-based Video Editing, Steve Seitz (Powerpoint Slides)

      "This talk describes a novel algorithm for synthesizing and editing video of natural phenomena that exhibit continuous flow patterns. The algorithm analyzes the motion of textured particles in the input video along user-specified flow lines, and synthesizes seamless video of arbitrary length by enforcing temporal continuity along a second set of user-specified flow lines. The algorithm is simple to implement and use. We used this technique to edit video of waterfalls, rivers, flames, and smoke."

    • 10:15-10:30: Keyframe-Based Tracking for Rotoscoping and Animation, Aseem Agarwala (Powerpoint Slides)

      "We describe a new approach to rotoscoping --- the process of tracking contours in a video sequence --- that combines computer vision with user interaction. In order to track contours in video, the user specifies curves in two or more frames; these curves are used as keyframes by a computer-vision-based tracking algorithm. The user may interactively refine the curves and then restart the tracking algorithm. Combining computer vision with user interaction allows our system to track any sequence with significantly less effort than interpolation-based systems --- and with better reliability than "pure" computer vision systems. Our tracking algorithm is cast as a spacetime optimization problem that solves for time-varying curve shapes based on an input video sequence and user-specified constraints. We demonstrate our system with several rotoscoped examples. Additionally, we show how these rotoscoped contours can be used to help create cartoon animation by attaching user-drawn strokes to the tracked contours."

Session II

    Making Life Easier for Disabled People (CSE 305)

    Search and Discovery (CSE 403)

    • 10:40-10:45: Introduction and Overview: Oren Etzioni

    • 10:45-11:00: Schema Matching, Jayant Madhavan (Powerpoint Slides)

      "Schema Matching is the problem of identifying corresponding elements in different schemas. Past solutions to this inherently difficult problem have proposed a principled combination of multiple algorithms. However, these solutions sometimes perform rather poorly due to the lack of sufficient evidence in the schemas being matched. In my talk, I will describe Corpus-based Schema Matching, a new approach that shows that a corpus, or collection of known schemas and mappings, can be used to improve the ability to match even unseen schemas. In particular, such a corpus can be used to augment the evidence available in the unseen schemas being matched and also learn schema design patterns that can be used to prune candidate matches. This is joint work with Alon Halevy, Philip Bernstein (Microsoft Research) and AnHai Doan (University of Illinois at Urbana-Champaign)."

    • 11:00-11:15: Web-scale Information Extraction in KnowItAll (Preliminary Results), Oren Etzioni (Powerpoint Slides)

      "Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank potentially relevant documents for human perusal, but do not extract facts, assess confidence, or fuse information from multiple documents. My talk introduces KnowItAll, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner."

    • 11:15-11:30: Searching Web Services, Luna Dong (Powerpoint Slides)

      "Web services are loosely coupled software components, published, located, and invoked across the web. The growing number of web services available within an organization and on the Web raises a new and challenging search problem: locating desired web services. Traditional keyword search is insufficient in this context: the specific types of queries users require are not captured, the very small text fragments in web services are unsuitable for keyword search, and the underlying structure and semantics of the web services are not exploited. In this talk we present Woogle, a web service search engine. Besides traditional keyword search, Woogle supports similarity search, template search and composition search. Furthermore, Woogle can automatically invoke the returned web services, fill in input parameters specified by users, and return the outputs of the web services. We will outline novel techniques to support these types of searches, and demonstrate the system."

    Making Systems more Reliable and Secure (CSE 691)

    • 10:40-10:45: Introduction and Overview: Dan Grossman

    • 10:45-11:00: Atomic Sections for Modern Languages: Improving the Reliability and Security of Concurrent Software, Michael Ringenburg (PDF Slides)

      "Concurrency has been an important and widely-used programming idiom for close to 40 years. Unfortunately, concurrent programming is also a common source of software errors, resulting in incorrect behavior, crashes, and security vulnerabilities. Programmers attempt to avoid these dangers with synchronization primitives like locks and semaphores. However, in order for these primitives to be effective, the entire program must obey subtle invariants. In particular, a piece of code may malfunction because of the failure of another piece of code to acquire a lock (race conditions) or to release a lock (deadlocks). This other piece of code may be in a separate procedure or source file, or even written by a different programmer. We claim that a better synchronization primitive is atomic. If a function or block of code is marked with atomic, the language implementation should guarantee that the block "appears" to be executed all at once, or atomically, from the perspective of the other threads. In this talk, I will explain why atomic is the right shared memory synchronization primitive, and show how it can be implemented efficiently for all systems where threads sharing memory do not exhibit true parallelism. If time permits, I will also mention ideas we have for extending this work to systems with true thread-level parallelism."

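      To give a feel for what programming with atomic looks like, here is a rough Python emulation. It is an assumption-laden sketch, not the implementation described in the talk: the `atomic()` helper is hypothetical and simply takes one global re-entrant lock, so a block is only atomic with respect to other `atomic()` blocks, not with respect to unmarked code.

```python
import threading
from contextlib import contextmanager

# One global re-entrant lock stands in for the language-level guarantee.
# Re-entrancy lets atomic blocks nest without deadlocking.
_global_lock = threading.RLock()


@contextmanager
def atomic():
    """Hypothetical helper: run the enclosed block without interleaving
    against any other atomic() block."""
    with _global_lock:
        yield


balances = {"checking": 100, "savings": 100}


def transfer(src, dst, amount):
    # The check and both updates appear to other atomic blocks as one step.
    with atomic():
        if balances[src] >= amount:
            balances[src] -= amount
            balances[dst] += amount


def worker(rounds):
    for _ in range(rounds):
        transfer("checking", "savings", 1)
        transfer("savings", "checking", 1)


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

total = sum(balances.values())
```

      The caller never names a specific lock, which is the point of the abstraction: no other piece of code can forget to acquire or release the right one.
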
    • 11:00-11:15: A Tamper-Evident Database System, Gerome Miklau (PDF Slides)

      "Data integrity is an assurance that data has not been modified in an unauthorized manner. A number of integrity vulnerabilities threaten current database systems. In this talk I will describe the design of a tamper-evident database system that allows a client to store, query, and update data on an untrusted server with strong guarantees that the server cannot modify data."

    • 11:15-11:30: Using Time Travel to Diagnose Configuration Errors, Andrew Whitaker (Powerpoint Slides)

      "This work addresses the problem of diagnosing configuration errors that cause a system to function incorrectly. For example, a change to the local firewall policy could cause a network-based application to malfunction. Our approach is based on searching across time for the instant the system transitioned into a failed state. Based on this information, a troubleshooter or administrator can deduce the cause of failure by comparing system state before and after the failure. We present the Chronus tool, which automates the task of searching for a failure-inducing state change. Chronus takes as input a user-provided software probe, which differentiates between working and non-working states. Chronus performs "time travel" by booting a virtual machine off the system's disk state as it existed at some point in the past. By using binary search, Chronus can find the fault point with effort that grows logarithmically with log size. We demonstrate that Chronus can diagnose a range of common configuration errors for both client-side and server-side applications, and that the performance overhead of the tool is not prohibitive."

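      The binary search over past states described in the Chronus abstract can be sketched as below. This is an illustration under stated assumptions rather than the tool's code: `find_fault_point`, the `history` list, and the `firewall_allows_port` field are hypothetical, a plain function call stands in for booting a virtual machine and running the probe, and the search assumes the system stays broken once the bad change has been made.

```python
def find_fault_point(snapshots, probe):
    """Return the index of the first snapshot for which probe() reports failure.

    Assumes the first snapshot works, the last one fails, and the failure
    persists once introduced.
    """
    lo, hi = 0, len(snapshots) - 1  # lo: known good, hi: known bad
    if not probe(snapshots[lo]) or probe(snapshots[hi]):
        raise ValueError("need a working first snapshot and a failing last snapshot")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if probe(snapshots[mid]):
            lo = mid  # still working here, so the fault is later
        else:
            hi = mid  # already broken here, so the fault is at or before mid
    return hi


# Hypothetical history: a firewall change at time 11 blocks the application.
history = [{"time": t, "firewall_allows_port": t < 11} for t in range(16)]
probe_calls = []


def probe(snapshot):
    probe_calls.append(snapshot["time"])  # record how many states were tested
    return snapshot["firewall_allows_port"]


fault_index = find_fault_point(history, probe)
```

      Comparing `history[fault_index - 1]` with `history[fault_index]` then gives the before and after states a troubleshooter would inspect.
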
Session III

    Location! Location! Location! (CSE 305)

    • 11:40-11:45: Introduction and Overview: Dieter Fox

    • 11:45-12:00: Determining Location From Wireless Communication Signals, Julie Letchner (Powerpoint Slides)

      "Knowledge of a person's location is the foundation of context-aware computing. Applications of location information range from giving directions to inferring activities to detecting when an elderly person is lost. We present a system that can use cell phones or laptop computers in combination with wireless signals to pinpoint a user's location within commonly available street maps. The system is able to use information from wireless access points and GSM cell phone towers. We discuss Bayesian learning techniques to estimate and adapt the sensor model over time. As a result, location estimation improves with increased use of the system. We demonstrate preliminary results with less than 20 meters of location error."

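      As a rough illustration of Bayesian location estimation from beacon sightings, the sketch below updates a belief over a handful of street segments. Everything concrete in it is an assumption for illustration: the segment names, the beacons `ap1` and `gsm7`, and the hearing probabilities are made up, and the sensor model is fixed here rather than learned and adapted over time as the abstract describes.

```python
def bayes_update(belief, likelihood):
    """Posterior is proportional to prior times likelihood, per location cell."""
    unnormalized = {cell: belief[cell] * likelihood[cell] for cell in belief}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation is impossible under the sensor model")
    return {cell: weight / total for cell, weight in unnormalized.items()}


# Hypothetical sensor model: probability of hearing each beacon from each segment.
sensor_model = {
    "ap1": {"A": 0.9, "B": 0.6, "C": 0.2, "D": 0.05, "E": 0.05},
    "gsm7": {"A": 0.3, "B": 0.7, "C": 0.8, "D": 0.7, "E": 0.3},
}


def localize(observations, cells=("A", "B", "C", "D", "E")):
    """observations is a list of (beacon, heard) pairs; returns the final belief."""
    belief = {cell: 1.0 / len(cells) for cell in cells}  # uniform prior
    for beacon, heard in observations:
        p_heard = sensor_model[beacon]
        likelihood = {c: p_heard[c] if heard else 1.0 - p_heard[c] for c in cells}
        belief = bayes_update(belief, likelihood)
    return belief


belief = localize([("ap1", True), ("gsm7", True)])
best_cell = max(belief, key=belief.get)
```

      Each additional sighting sharpens the belief, and treating "not heard" as evidence too lets silent beacons rule out nearby segments.
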
    • 12:00-12:15: Extending Place Lab to 3-D, Alan Liu (PDF Slides)

      "How would a mobile computer locate itself inside a building? Our work is based on the Place Lab architecture, which consists of three elements: radio beacons in the environment, databases that hold information about beacons, and clients that use this data to estimate their current location. However, Place Lab is tuned for the outdoors and gives estimates in two dimensions only. Methods for extending Place Lab to provide estimates with room and floor precision, and preliminary results, will be presented."

    • 12:15-12:30: Extracting Places from Traces of Locations, Jong Hee Kang (PDF Slides)

      "Location-aware systems are proliferating on a variety of platforms from laptops to cell phones. Locations are expressed in two principal ways: coordinates and landmarks. However, users are often more interested in "places" rather than locations. A place is a locale that is important to an individual user and carries important semantic meanings such as being a place where one works, lives, plays, meets socially with others, etc. Our devices can make more intelligent decisions on how to behave when they have this higher level information. For example, a cell phone can switch to a silent mode when the user is in a quiet place (e.g., a movie theater, a lecture hall, or a place where one meets socially with others). It would be tedious to define this in terms of coordinates. In this paper, we describe an algorithm for extracting significant places from a trace of coordinates, and evaluate the algorithm with real data collected using Place Lab [13], a coordinate-based location system that uses a database of locations for WiFi hotspots."

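      The abstract above does not spell out the extraction algorithm, so the following is only one plausible sketch, assuming a simple time-based clustering: consecutive fixes that stay within `radius` metres of the running cluster centre are grouped, and a group counts as a place if it lasts at least `min_stay` seconds. The function names, thresholds, and trace are all hypothetical.

```python
import math


def centroid(points):
    """Mean (x, y) of a list of (seconds, x, y) fixes."""
    n = len(points)
    return (sum(p[1] for p in points) / n, sum(p[2] for p in points) / n)


def extract_places(trace, radius=30.0, min_stay=300.0):
    """Return centres of clusters where the user lingered for at least min_stay."""
    places, cluster = [], []
    for point in trace:
        if cluster:
            cx, cy = centroid(cluster)
            if math.hypot(point[1] - cx, point[2] - cy) > radius:
                # The user moved away: keep the cluster only if the stay was long enough.
                if cluster[-1][0] - cluster[0][0] >= min_stay:
                    places.append(centroid(cluster))
                cluster = []
        cluster.append(point)
    if cluster and cluster[-1][0] - cluster[0][0] >= min_stay:
        places.append(centroid(cluster))
    return places


# Hypothetical trace: ten minutes at one spot, a walk, then ten minutes at another.
trace = (
    [(float(t), 0.0, 0.0) for t in range(0, 601, 60)]
    + [(660.0, 200.0, 0.0), (720.0, 400.0, 0.0), (780.0, 600.0, 0.0)]
    + [(float(t), 1000.0, 0.0) for t in range(840, 1441, 60)]
)
places = extract_places(trace)
```

      The fixes recorded while walking form clusters that are too short-lived to count, so only the two locations where the user lingered are reported.
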
    Internet Performance and Security (CSE 403)

    • 11:40-11:45: Introduction and Overview: David Wetherall

    • 11:45-12:00: A Unified Security Model for Web Applications, Rick Cox (PDF Slides)

      "Our web browsers have evolved into complete operating systems, providing a platform for complex web applications. In this role, it is the browser that must provide the abstractions, resource allocation, and isolation mechanisms that allow users to manage the numerous web applications they use on a daily basis. However, the current security models --- based on controlling interaction between individual documents --- make the browser's task significantly more difficult than it need be. In this talk, I will discuss the problems faced by both developers and users of web applications that are related to the lack of any explicit definition for these applications. I will also present a prototype browser that takes advantage of a strong definition for web applications, providing more trustworthy isolation and better management interfaces while requiring only minimal changes to existing sites. Providing a more secure browser environment also has advantages for developers, who can use our tools to provide better protection for their users' accounts."

    • 12:00-12:15: Distributed Detection of Anomalous Aggregate Traffic Flows, Ankur Jain (Powerpoint Slides)

      "As more open distributed systems, overlay and P2P networks are deployed, there is a greater need for mechanisms that monitor overall system activity rather than the behavior of individual nodes. Existing algorithms such as periodic distributed queries are poorly suited to this task, being either expensive or slow at detecting violations of aggregate properties. In this talk, I will present CoRaL, a new family of mechanisms that quickly detects aggregate anomalies, yet incurs little communication overhead in the common case that the system behaves normally. I illustrate the design and implementation of the system and show some results using our canonical example: checking that the aggregate traffic sent from all nodes of a system does not exceed a fixed rate limit, as would happen in a Distributed DoS (DDoS) attack. I will also discuss how these mechanisms could be used in other settings such as correlating Intrusion Detection Systems (IDS) hits cheaply on an Internet-wide scale to detect worm/virus outbreaks quickly and accurately."

    • 12:15-12:30: Data Turbine, Gaurav Bhaya (Powerpoint Slides)

      "Today, a large amount of compelling digital content, including audio, video, and news, can be found on hundreds of thousands of unscheduled continuous data streams on the Internet. The lack of a schedule coupled with the sheer number of streams makes it extremely difficult for users to find the streaming content they desire. In response, we propose a new technology called the Data Turbine that quickly and efficiently locates digital content floating within a large number of Internet streams. In order to explore the feasibility of the Data Turbine, we have designed, simulated, and implemented Radio Turbine, a software system that allows users to find desired audio content, such as music, within any one of the tens of thousands of easily-discovered Internet radio streams. With Radio Turbine, a user can find the majority of songs of their choosing in a short period of time. For example, using a playlist of 100 titles available from a major Internet music store, Radio Turbine was able to find over half within the first two hours, and nearly 80% within the first twelve. Moreover, Radio Turbine can find titles more quickly and more completely than a popular peer-to-peer music sharing system. Unlike such systems, though, Radio Turbine does not induce users to redistribute music, violating copyrights. In this talk, I shall describe and demonstrate a particular implementation of Radio Turbine which we built."

    Smart User Interfaces (CSE 691)

    • 11:40-11:45: Introduction and Overview: Dan Weld

    • 11:45-12:00: Speech, Ink, and Slides: The Interaction of Content Channels, Craig Prince and Jon Su (Powerpoint Slides)

      "A growing number of systems support the use of digital ink on electronic slides for the delivery of presentations. The motivation for adding ink on slides as a new information channel is to increase the speaker's flexibility in presenting material and to support links between spoken utterances and slide content. In this talk, we report on an empirical exploration of digital ink and speech usage in lecture presentation using data collected from five Master's-level Computer Science courses. Our interest in understanding how ink and speech are used together is to inform the development of future tools for supporting classroom presentation, distance education, and viewing of archived lectures. We want to make it easier to interact with electronic materials and to extract information from them. We want to provide an empirical basis for addressing challenging problems such as automatically generating full text transcripts of lectures, matching speaker audio with slide content, and recognizing the meaning of the instructor's ink. Our results include an evaluation of handwritten word recognition in the lecture domain, an approach for associating attentional marks with content, an analysis of linkage between speech and ink, and an application of recognition techniques to infer speaker actions."

    • 12:00-12:15: Automatically Generating Adaptable User Interfaces, Krzysztof Gajos (PDF Slides)

      "Interfaces shipped with today’s complex applications are designed in a “one size fits all” manner; alas, by aiming to address the needs of the “average user” they miss essential needs of most individual users. In contrast, we believe each user deserves a custom-built UI that best reflects her needs. Realizing this dream is complicated by the shift away from the desktop and toward pervasive computing. Most of today’s applications are designed to work with keyboard and pointer and assume a small range of screen sizes. However, people are using an increasing variety of display-equipped devices, which employ different interaction techniques and span a huge range of display sizes (e.g., cell-phones to live boards). In response, we are creating SUPPLE, a system that generates an interface which optimizes the user’s expected utility on the device at hand and adapts as appropriate to changes in user activity. After a brief summary of SUPPLE as presented at the last Affiliates meeting, I will describe our recent progress:
      • Since SUPPLE’s behavior depends on an accurate estimate of the user’s utility function, we use every interaction (e.g., user’s customization commands) to refine the system’s utility estimate.
      • SUPPLE can dynamically create “one touch” access to commonly-used functionality, while maintaining a stable, predictable interface structure.
      • SUPPLE’s flexible customization facility uses machine learning to interpret a user’s request as potentially applying to more than one aspect of an interface, e.g., perhaps to multiple applications."

    • 12:15-12:30: Digital Simplicity: Usable Home Ubicomp, James Landay (Powerpoint Slides)

      "There are many indicators showing that people feel technology is speeding up and complicating their lives. In response, many individuals reject certain computing and communications technologies when given the choice (e.g., for use in their homes). Current research in ubiquitous computing has a tendency to fall into this same trap. Purveyors of these technologies tend to force customers into a Faustian bargain: “take this complex technology into your personal life and you will now be able to do these new functions: A, B, & C”. The Digital Simplicity project tries to instead offer a different value proposition: “take this simple technology into your personal life and we will make the activities A, B, & C you are currently doing simpler.” In this talk I give an overview of the technical, design, and applications research we are carrying out to make Digital Simplicity a reality."