Thursday, October 23, 2008
9:25 - 10:30
Peer-to-Peer and Beyond
Information Extraction
Socially-Relevant Applications
10:40 - 11:30
Pacific Northwest Center for Neural Engineering
Interacting with Inference
Usable Security and Privacy
11:40 - 12:30
Computing for Everyone
The Many Faces of Concurrency
Pictures, Games, and Movies
Peer-to-Peer and Beyond (CSE 305)
- 9:25-9:30: Introduction and Overview: Arvind Krishnamurthy
- 9:30-9:45: Get What You Put: The Makings of a Consistent DHT, Ivan Beschastnikh (PDF Slides)
Modern P2P systems are increasingly relying on distributed storage and lookup services, such as Distributed Hash Tables (DHTs). Deployed DHTs, however, support only weak consistency semantics and often assume an unrealistic level of node and routing homogeneity. We present Harmony, a DHT designed to maintain strict data and routing consistency and adapt to highly dynamic workloads and variable node resources. Harmony manages groups of replicas with provably safe distributed replication and consensus algorithms to maintain consistency in all circumstances. We show that Harmony scales to millions of nodes, leveraging expected node lifetime and other node characteristics to achieve both high availability and high performance.
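This abstract does not spell out Harmony's protocols, but the core idea of keeping a replica group strictly consistent can be pictured with a minimal majority-quorum sketch (illustrative Python only, not Harmony's implementation): because every read quorum overlaps every write quorum, a read always sees the latest committed write.

```python
# Minimal majority-quorum sketch (illustrative only; this is not
# Harmony's actual replication or consensus protocol).

class Replica:
    """A single storage node holding one versioned value."""
    def __init__(self):
        self.version, self.value = 0, None

    def write(self, version, value):
        if version > self.version:
            self.version, self.value = version, value

    def read(self):
        return self.version, self.value


class ReplicaGroup:
    """Reads and writes each contact a majority, so any read quorum
    overlaps any write quorum in at least one up-to-date replica."""
    def __init__(self, n=5):
        self.replicas = [Replica() for _ in range(n)]
        self.quorum = n // 2 + 1
        self.next_version = 0

    def put(self, value):
        self.next_version += 1
        for r in self.replicas[:self.quorum]:      # one write quorum
            r.write(self.next_version, value)

    def get(self):
        # Any majority works; take the freshest value seen in it.
        reads = [r.read() for r in self.replicas[-self.quorum:]]
        return max(reads, key=lambda vv: vv[0])[1]
```

Even though `get` deliberately reads a different majority than `put` wrote, the quorum intersection guarantees the newest version is observed.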
- 9:45-10:00: OneSwarm: A System for Inexpensive, Social, Large-Scale Content Distribution, Tomas Isdal (PDF Slides)
Peer assisted content distribution has received significant attention in recent years as a means of externalizing the bandwidth costs associated with Internet-scale services. In spite of this, most services currently rely on large data centers, e.g., YouTube, iTunes. In this talk, we'll describe (and demonstrate) a data distribution layer, called OneSwarm, that allows service providers to offload bulk data transfers to a P2P network. Unlike existing P2P networks, OneSwarm is designed to preserve the privacy of its users and work transparently within a web browser.
- 10:00-10:15: Practical Reverse Path Traceroute, Ethan Katz-Bassett (PDF Slides)
Traceroute is the most widely used Internet diagnostic tool today. Network operators use it to help identify routing failures, path inflation, and router misconfigurations. Researchers use it to map the Internet, predict performance, geolocate routers, and classify the performance of ISPs. However, traceroute has long had a fundamental limitation: it does not provide reverse path information. Although various public traceroute servers across the Internet provide some visibility, no general method exists for determining a reverse path from an arbitrary destination, without control of that destination.
In this work, we address this longstanding limitation by building a practical reverse traceroute tool. Our tool provides the same information as traceroute, but for the reverse path, and it works in the same case as traceroute, when the user may lack control of the destination. Our approach combines a number of ideas: source spoofing, IP timestamp and record route options, and multiple vantage points. Deploying our system on PlanetLab, we can determine the complete reverse route for more than 40% of cases. We use our reverse traceroute system to study previously unmeasurable aspects of the Internet: we uncover thousands of AS peering links invisible to current topology mapping efforts, and we present a case study of how a content provider could use our tool to troubleshoot poor performance.
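The stitching at the heart of this approach can be pictured with a toy sketch (illustrative Python; the router names and hop table are made up): each measurement technique contributes one observed reverse hop, and the tool chains hops from the destination back toward the source.

```python
# Toy sketch of reverse-path stitching (not the actual system): each
# entry records one reverse hop learned from some measurement, e.g.
# a spoofed record-route probe seen from a vantage point.
known_reverse_hop = {"D": "R3", "R3": "R2", "R2": "R1", "R1": "S"}

def reverse_path(dst, src):
    """Chain observed reverse hops from dst until we reach src."""
    path, node = [dst], dst
    while node != src:
        node = known_reverse_hop[node]   # next hop back toward us
        path.append(node)
    return path
```

In the real system each hop comes from a different measurement (timestamp options, record route, spoofing from vantage points), and the chain terminates once it reaches a router whose onward path to the source is already known.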
- 10:15-10:30: An End to the Middle, Colin Dixon (PDF Slides)
The last decade has seen a vast proliferation of middleboxes to solve all manner of persistent limitations in the Internet protocol suite. Examples include firewalls, NATs, load balancers, traffic shapers, deep packet intrusion detection, virtual private networks, network monitors, transparent web caches, and content delivery networks, and the list goes on and on. This trend has enabled network administrators to provide security, QoS, and other critical services to their users, but at considerable cost in management complexity and increasing inertia to new applications. In this paper, we argue that this long-standing trend is about to come to an end, and we will be better off for it. We show that functionality that seemingly must be in the network, such as NATs and traffic prioritization, can more cheaply, flexibly, and securely be provided by distributed software running on end hosts.
Information Extraction (CSE 403)
- 9:25-9:30: Introduction and Overview: Dan Weld
- 9:30-9:51: Intelligence in Wikipedia, Dan Weld (PDF Slides)
Berners-Lee's vision of the Semantic Web is hindered by a chicken-and-egg problem, which can best be solved by bootstrapping: creating enough structured data to motivate the development of applications. We believe that autonomously "semantifying" Wikipedia is the best way to bootstrap. We choose Wikipedia as an initial data source because it is comprehensive, high-quality, not too large, and contains enough manually-derived structure to bootstrap an autonomous, self-supervised process. In this talk I will present our success to date in this endeavor:
- A novel approach for self-supervised learning of CRF information extractors
- Automatic construction of a comprehensive ontology via statistical-relational learning
- Vast improvements in extraction recall through shrinkage over this ontology and retraining
- The stimulation of a virtuous feedback cycle between communal content creation and information extraction
We aim to construct a knowledge base of outstanding size to support inference, automatic question answering, faceted browsing, and potentially to bootstrap the Semantic Web.
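The shrinkage idea above can be sketched concretely (hypothetical Python; the class names, counts, and blending weight are invented for illustration, and this is not Kylin's actual estimator): a sparse class borrows statistical strength from its taxonomy ancestors, with classes that have little training data pulled more strongly toward the ancestor estimate.

```python
# Hedged sketch of shrinkage over a class taxonomy. All numbers and
# class names are hypothetical.

taxonomy = {"performer": "person", "person": None}  # child -> parent

# (Hypothetical) counts of how often an attribute appears per class.
counts = {
    "person":    {"birth_date": 900, "total": 1000},
    "performer": {"birth_date": 3,   "total": 4},    # sparse class
}

def shrunk_estimate(cls, attr, weight=50.0):
    """Blend each class's local estimate with its ancestors',
    trusting local data more as its sample size grows."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = taxonomy[cls]
    est = 0.5                      # uninformative prior at the root
    for c in reversed(chain):      # refine from root down to leaf
        n = counts[c]["total"]
        local = counts[c][attr] / n if attr in counts[c] else 0.0
        lam = n / (n + weight)     # shrinkage coefficient
        est = lam * local + (1 - lam) * est
    return est
```

Here the sparse "performer" class's raw estimate (3/4 = 0.75) is pulled toward the well-supported "person" estimate (0.9), which is the effect that improves recall on long-tail classes.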
- 9:51-10:11: Information Extraction from Wikipedia: Moving Down the Long Tail, Fei Wu (PDF Slides)
Not only is Wikipedia a comprehensive source of quality information, it has several kinds of internal structure (e.g., relational summaries known as infoboxes), which enable self-supervised information extraction. While previous efforts at extraction from Wikipedia achieve high precision and recall on well-populated classes of articles, they fail in a larger number of cases, largely because incomplete articles and infrequent use of infoboxes lead to insufficient training data. This paper presents three novel techniques for increasing recall from Wikipedia’s long tail of sparse classes: (1) shrinkage over an automatically-learned subsumption taxonomy, (2) a retraining technique for improving the training data, and (3) supplementing results by extracting from the broader Web. Our experiments compare design variations and show that, used in concert, these techniques increase recall by a factor of 1.76 to 8.71 while maintaining or increasing precision.
- 10:11-10:30: Scaling Textual Inference to the Web, Stefan Schoenmackers (PDF Slides)
Most Web-based Q/A systems work by finding pages that contain an explicit answer to a question. These systems are helpless if the answer has to be inferred from multiple sentences, possibly on different pages. To solve this problem, we introduce the Holmes system, which utilizes Textual Inference (TI) over tuples extracted from text.
Whereas previous work on TI (e.g., the literature on textual entailment) has been applied to paragraph-sized texts, Holmes utilizes knowledge-based model construction to scale TI to a corpus of 117 million Web pages. Given only a few minutes, Holmes doubles recall for example queries in three disparate domains (geography, business, and nutrition). Importantly, Holmes's runtime is linear in the size of its input corpus due to a surprising property of many textual relations in the Web corpus: they are "approximately" functional in a well-defined sense.
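The "approximately functional" property can be made concrete with a small sketch (illustrative Python; the tuples are invented, not Holmes data): a relation is approximately functional when, for most x, a single y accounts for nearly all of x's occurrences, which keeps inference fan-out small.

```python
# Sketch of measuring how "functional" an extracted relation is.
# Illustrative data only.
from collections import defaultdict

def degree_of_functionality(tuples):
    """Fraction of tuple occurrences covered by each x's most common
    y. 1.0 = perfectly functional; near 1.0 = approximately so."""
    by_x = defaultdict(lambda: defaultdict(int))
    for x, y in tuples:
        by_x[x][y] += 1
    covered = sum(max(ys.values()) for ys in by_x.values())
    total = sum(sum(ys.values()) for ys in by_x.values())
    return covered / total

headquartered_in = [
    ("Microsoft", "Redmond"), ("Microsoft", "Redmond"),
    ("Boeing", "Chicago"), ("Boeing", "Seattle"),  # extraction noise
    ("Amazon", "Seattle"),
]
```

For this toy relation the degree is 0.8: mostly functional despite one noisy extraction, so chaining inferences through it multiplies the candidate set only slightly.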
Socially-Relevant Applications (CSE 691)
- 9:25-9:30: Introduction and Overview: James Landay
- 9:30-9:45: Voice-driven Interaction: Harnessing the Capacity of Human Voice for Controlling Computer Interfaces, Susumu Harada (PDF Slides)
Speech and voice-based input has the potential of enabling fluid hands-free interaction with computers, especially for users with various forms of motor impairments. However, traditional speech-based interactions tend to be limited to dictation-based text entry or command-and-control-based discrete interaction. We have been exploring ways to extend the expressivity of voice-based computer input by harnessing the richness of human vocal production beyond spoken words. We present the capabilities of non-speech vocalization as an input modality and the vision of fluid voice-based interaction through a series of projects based on the Vocal Joystick engine developed here at the University of Washington.
- 9:45-10:00: A Life-Size Virtual Robotic Environment for Stroke Rehabilitation, Brian Dellon (PDF Slides)
As the future draws near, the need for assistive robotic devices will increase. During the 1950s, only 4.9% of the world's population was over the age of 65. Today, almost 20% are over 65, and that share is predicted to exceed 35% of the world's population by 2050. This demographic shift is coupled with a rising rate of stroke and other neurological disorders. Robotic solutions will help tackle these issues and enable the elderly to regain their independence and live out enriching and meaningful lifestyles. Integration of robotic manipulation into current rehabilitation practices holds the promise of improving the quality of physical rehabilitation, increasing the efficiency of therapists, and allowing more flexible, programmable rehabilitation environments; yet we must consider the safety aspects of such assistive devices for successful integration into domestic and clinical settings. To that end, we have created the Brake Actuated Manipulator (BAM), a passive haptic device designed with safety in mind, which supports full-body upper-extremity physical rehabilitation paradigms. Furthermore, incorporating visual feedback distortion schemes in virtual reality bolsters the BAM's passive functionality and creates a system in which perceptual deficits can be addressed and limited mobility and strength can be trained.
- 10:00-10:15: UbiGreen: Encouraging Environmental Stewardship, Jon Froehlich (PDF Slides)
Mobile phones have the potential to be one of the most transformative technologies in human history, not just as a communication tool but as a persuasive medium to support changing attitudes and behaviors. In the UbiGreen project, we use mobile phones, sensors, and machine learning techniques to automatically recognize transportation behaviors such as walking, biking, and moving in a motor vehicle. The UbiGreen phone application displays iconic, yet provocative images based on semi-automatically sensed green transportation activities. These images reward users for green transportation behaviors, help increase awareness about their transportation patterns, and reveal how transportation behaviors may affect the environment. In this talk, we discuss the potential of mobile phones as persuasive technologies, describe UbiGreen and report on a three-week prototype deployment to thirteen users in Seattle and Pittsburgh.
- 10:15-10:30: Promoting Interaction in Video-Based Agricultural Extension, Natalie Linnell (PDF Slides)
My research seeks to improve classroom and practical education in the developing world using mediated video, an approach in which video-recorded lessons are played back by a semi-skilled facilitator. The interaction the facilitator creates around the video materials is vitally important, so my work seeks to provide structure and support for this interaction using technology. I will discuss work I did during an internship with Microsoft Research India: three different interventions to increase interaction in Digital Green, which uses mediated video for agricultural extension.
Pacific Northwest Center for Neural Engineering (CSE 305)
- 10:40-10:45: Introduction and Overview: Raj Rao
- 10:45-11:00: Overview and Mission of the Neural Engineering Initiative, Yoky Matsuoka (PDF Slides)
- 11:00-11:15: EEG-Based Brain-Computer Interaction: Principles and Applications, Reinhold Scherer (PDF Slides)
A Brain-Computer Interface (BCI) is a communication system that allows the user to bypass the efferent pathways of the central nervous system and thus to directly link the human brain with a machine. Such a direct brain-computer interaction can be realized by monitoring, analyzing and classifying the electroencephalogram (EEG) in real-time. Here, we illustrate the possibilities offered from this kind of technology and present results of studies with able-bodied and disabled individuals performed inside the laboratory and in real-world environments. The applications include the control of neuroprostheses and robotic devices, the interaction with Virtual Reality, and the operation of “off-the-shelf” software such as Google Earth.
- 11:15-11:30: The Neurobotics Lab: Toward Human-level Dexterous Manipulation, Ravi Balasubramanian (PDF Slides)
The Neurobotics Laboratory is focused on using and developing robotic technology to understand, rehabilitate, assist, and enhance human motor control and learning capabilities. We also investigate the use of a variety of biosignals, such as ECoG, EEG, and EMG, to explore motor control in the human body. After a brief overview of the field of neurobotics and our lab's projects, this talk will focus on one project that explores how we can achieve human-level dexterous robotic manipulation. Specifically, by combining experiments with human subjects and an anthropomorphic robotic hand, we identify how the human control system exploits the redundant neuro-musculo-skeletal system to learn and perform daily tasks reliably.
Interacting with Inference (CSE 403)
- 10:40-10:45: Introduction and Overview: James Fogarty
- 10:45-11:00: Amplifying Community Content Creation with Mixed Initiative Information Extraction, Raphael Hoffmann (PDF Slides)
Although existing work has explored both information extraction and community content creation, most research has focused on them in isolation. In contrast, we see the greatest leverage in the synergistic pairing of these methods as two interlocking feedback cycles. This paper explores the potential synergy promised if these cycles can be made to accelerate each other by exploiting the same edits to advance both community content creation and learning-based information extraction. We examine our proposed synergy in the context of Wikipedia infoboxes and the Kylin information extraction system. After developing and refining a set of interfaces to present the verification of Kylin extractions as a non-primary task in the context of Wikipedia articles, we develop an innovative use of Web search advertising services to study people engaged in some other primary task. We demonstrate our proposed synergy by analyzing our deployment from two complementary perspectives: (1) we show we accelerate community content creation by using Kylin's information extraction to significantly increase the likelihood that a person visiting a Wikipedia article as a part of some other primary task will spontaneously choose to help improve the article's infobox, and (2) we show we accelerate information extraction by using contributions collected from people interacting with our designs to significantly improve Kylin's extraction performance.
- 11:00-11:15: CueFlik: Interactive Concept Learning in Image Search, James Fogarty (PDF Slides)
Web image search is difficult in part because a handful of keywords are generally insufficient for characterizing the visual properties of an image. Popular engines have begun to provide tags based on simple characteristics of images (such as tags for black and white images or images that contain a face), but such approaches are limited by the fact that it is unclear what tags end-users want to be able to use in examining Web image search results. This talk presents CueFlik, a Web image search application that allows end-users to quickly create their own rules for re-ranking images based on their visual characteristics. End-users can then re-rank any future Web image search results according to their rule. In an experiment we present in this paper, end-users quickly create effective rules for such concepts as “product photos”, “portraits of people”, and “clipart”. When asked to conceive of and create their own rules, participants create such rules as “sports action shot” with images from queries for “basketball” and “football”. CueFlik represents both a promising new approach to Web image search and an important study in end-user interactive machine learning.
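The rule-creation idea can be sketched in miniature (illustrative Python; the feature vectors, feature names, and nearest-example scorer are invented for illustration, not CueFlik's actual learner): rank each result by how much closer its visual features are to the user's positive examples than to the negative ones.

```python
# Toy sketch of concept-based re-ranking. Features and the scoring
# rule are hypothetical, not CueFlik's learned distance metric.

def score(features, positives, negatives):
    """Higher score = more like the positive examples."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    d_pos = min(dist(features, p) for p in positives)
    d_neg = min(dist(features, n) for n in negatives)
    return d_neg - d_pos

def rerank(results, positives, negatives):
    return sorted(results,
                  key=lambda r: score(r[1], positives, negatives),
                  reverse=True)

# Hypothetical (name, [edge_density, saturation]) feature vectors.
results = [("photo1", [0.9, 0.8]), ("clipart1", [0.2, 0.1]),
           ("photo2", [0.8, 0.7])]
positives = [[0.1, 0.2]]   # user marked clipart-like examples
negatives = [[0.9, 0.9]]   # user marked photo-like examples
```

After the user marks a few examples, the same rule can be applied to any future result set, which is the interaction CueFlik enables.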
- 11:15-11:30: Examining Difficulties Software Developers Encounter in the Adoption of Statistical Machine Learning, Kayur Patel (Presentation)
Statistical machine learning continues to show promise as a tool for addressing complex problems in a variety of domains. An increasing number of developers are therefore looking to use statistical machine learning algorithms within applications. We have conducted two initial studies examining the difficulties that developers encounter when creating a statistical machine learning component of a larger application. We first interviewed researchers with experience integrating statistical machine learning into applications. We then sought to directly observe and quantify some of the behavior described in our interviews using a laboratory study of developers attempting to build a simple application that uses statistical machine learning. This talk presents the difficulties we observed in our studies, discusses current challenges to developer adoption of statistical machine learning, and proposes potential approaches to better supporting developers creating statistical machine learning components of applications.
Usable Security and Privacy (CSE 691)
- 10:40-10:45: Introduction and Overview: Tadayoshi Kohno
- 10:45-11:00: Social Access Control through Shared Knowledge Questions, Michael Toomim
The Internet is making our personal lives public because we actively upload and broadcast ourselves. We use MySpace, YouTube, and Facebook more than any websites except Google and Yahoo. The Internet offers unprecedented socializing efficiency, but it imposes a tradeoff: you can socialize either online or offline, and choose a life that is either public or private.
Our real lives, on the other hand, are mostly semi-private: we share them with some people, but not others. We are developing a new method of social access control, where users devise a set of simple questions of shared knowledge to guard sets of social content instead of managing many black & white accounts, passwords, and access control lists. We implemented a prototype and conducted studies to explore the context of photo sharing security, gauge the difficulty of creating shared knowledge questions, measure their resilience to adversarial attack, and evaluate users' ability to understand and predict this resilience.
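A minimal sketch of this style of access control (illustrative Python; the question, answers, and normalization rules are hypothetical, not the study's implementation): a piece of content is guarded by a question whose answer friends already know, with light normalization so close variants of the answer pass.

```python
# Toy shared-knowledge gate. The normalization here is deliberately
# simple; a real system would need to balance guessability against
# friends' ability to recall the exact phrasing.

def normalize(answer):
    words = answer.lower().split()
    return " ".join(w for w in words if w not in {"the", "a", "an"})

class GuardedAlbum:
    def __init__(self, question, accepted_answers):
        self.question = question
        self.accepted = {normalize(a) for a in accepted_answers}

    def request_access(self, answer):
        return normalize(answer) in self.accepted

album = GuardedAlbum("Where did we camp last summer?",
                     ["The Gorge", "gorge campground"])
```

The owner writes one question per audience instead of maintaining accounts and access control lists; resilience then depends on how hard the answer is for an adversary to guess or look up.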
- 11:00-11:15: Security and Privacy for Wireless Medical Devices, Tamara Denning (PDF Slides)
Implantable medical device (IMD) technologies --- which are wholly or partially implanted within patients' bodies --- are not only enabling new medical therapies with the potential of greatly improving patients' lives, but are also incorporating more sophisticated wireless transceivers and becoming more computationally complex. These technological trends, coupled with the physiological importance of these devices, suggest potentially harmful consequences if the devices fail to provide appropriate security and privacy safeguards. Research from the University of Washington and its collaborators shows that it is currently possible for an adversary to use his or her own equipment to reprogram an implantable defibrillator, exploit the defibrillator to compromise the patient's privacy, or even exploit the defibrillator to cause a potentially fatal heart rhythm. We are now investigating technological designs that protect patients from unauthorized wireless commands while still allowing emergency medical staff to take control of a patient's IMD.
- 11:15-11:30: Context Aware Security for RFIDs, RFID Credit Cards, Access Control Badges, and More, Karl Koscher
We tackle the problem of defending against ghost-and-leech (a.k.a. proxying, relay, or man-in-the-middle) attacks against RFID tags and other contactless cards. The approach we take — which we dub secret handshakes — is to incorporate gesture recognition techniques directly on the RFID tags or contactless cards. These cards will only engage in wireless communications when they internally detect these secret handshakes. We demonstrate the effectiveness of this approach by implementing our secret handshake recognition system on a passive WISP RFID tag with a built-in accelerometer. Our secret handshakes approach is backward compatible with existing deployments of RFID tag and contactless card readers. It was also designed to minimize changes to the existing usage model of certain classes of RFID and contactless cards, such as access cards kept in billfold and purse wallets: for example, one can perform a secret handshake without removing the card from one's wallet.
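The gesture-gating idea can be sketched simply (illustrative Python; the template, samples, and correlation threshold are invented, and this is not the WISP firmware): the tag answers the reader only if its recent accelerometer trace correlates strongly with a stored gesture template.

```python
# Toy "secret handshake" detector. One-axis accelerometer samples
# and the enrolled template are made up for illustration.

def correlation(a, b):
    """Pearson correlation of two equal-length sample traces."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb) if va and vb else 0.0

TEMPLATE = [0, 2, 5, 2, 0, -2, -5, -2]   # enrolled shake gesture

def unlock(trace, threshold=0.9):
    """Respond to the reader only if the gesture matches."""
    return correlation(trace, TEMPLATE) >= threshold
```

Correlation makes the check amplitude-invariant, so a gentler or stronger version of the same shake still unlocks, while a card sitting still in a wallet stays silent to a ghost-and-leech reader.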
The Pacific Northwest region is well-positioned to become a leader in neural engineering and neurotechnology. The world's first experiment demonstrating a neuron's ability to drive an external device was performed in the early seventies at the University of Washington. Today, many regional companies and industrial labs are forming research teams, collaborating with academic researchers, and developing neural engineering products such as implantable chips and novel human-device interfaces for personal assistance and rehabilitation. The time is right for uniting the neural engineering efforts of academia and industry in the Pacific Northwest region through the creation of a Pacific Northwest Center for Neural Engineering. The Center's mission will be to improve the quality of human life through research, development, and application of neural engineering technologies. Our vision is to build a center that will be the number one destination for students, researchers, and companies that want to be at the forefront of human and assistive device integration and the neural engineering field.
Computing for Everyone
- 11:40-11:45: Introduction and Overview: Richard Ladner
- 11:45-12:00: MobileASL, Neva Cherniavsky (PDF Slides)
The goal of the MobileASL project is to compress sign language video so that deaf users can communicate via cell phone. While the deaf community has welcomed new technologies such as the Blackberry and other PDAs, text messaging is cumbersome compared to signing, since the speed of sign language is equivalent to that of speech. However, with current compression technology and low mobile phone bit rates, real-time video transmission is not possible. Our goal is to make cell phones accessible by supporting real-time compression and transmission of sign language video.
- 12:00-12:15: Academic Access for Deaf and Hard of Hearing Students, Anna Cavender (PDF Slides)
High bandwidth internet connections in classrooms, online multimedia and video sharing, and social networking can bring together geographically dispersed students who are deaf and hard of hearing and improve academic access in science, technology, engineering, and mathematics (STEM). Two projects include ClassInFocus, a platform to improve accessibility in the classroom, and ASL-STEM Forum, an online video forum to facilitate discussion about sign language for STEM topics.
- 12:15-12:30: Usable Available Web Access, Jeff Bigham
Realizing the full promise of an inclusive web involves more than just making access to content possible. In this talk, we'll use two research projects to explore "usability" and "availability", two complementary components of accessibility. First, we'll consider availability through WebAnywhere, an interface to the web that brings the assistive technology that users need to any computer, even locked-down public terminals, enabling access to the web from any machine. Second, we'll look at audio CAPTCHAs, demonstrate their difficulty relative to their visual counterparts, and show how improving the interface used to solve them can dramatically improve the success rate of humans without impacting security.
The Many Faces of Concurrency
- 11:40-11:45: Introduction and Overview: Susan Eggers
- 11:45-12:00: Atom-Aid: Detecting and Surviving Atomicity Violations, Brandon Lucia
Writing shared-memory parallel programs is extremely error-prone. Among the most common and pernicious concurrency errors are atomicity violations, which occur when programmers fail to enclose in a single critical section regions of code that should execute atomically.
Recent research has proposed arbitrarily grouping dynamic memory operations into atomic blocks in order to enforce memory ordering at a coarse grain. In addition to enforcing memory ordering, these techniques probabilistically survive atomicity violations by reducing the number of opportunities for interleaving of memory operations between executing threads. Building on this idea, Atom-Aid creates atomic blocks "intelligently" (rather than arbitrarily), further reducing the probability that atomicity violations will manifest. The technique pinpoints potential atomicity violations and executes them inside an atomic block, thus providing reliable execution and debuggability. We evaluate Atom-Aid using buggy code from applications, including Apache, MySQL, and XMMS, showing that Atom-Aid virtually eliminates the manifestation of atomicity violations.
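The kind of bug Atom-Aid targets can be shown in a few lines (a classic check-then-act example in Python, invented for illustration; it is not code from the evaluated applications): the check and the update must execute atomically, but nothing enforces it, so another thread can interleave between them. Enclosing both in one critical section is the fix that Atom-Aid's atomic blocks approximate automatically.

```python
# A classic atomicity violation and its fix (illustrative example).
import threading

balance = 100
lock = threading.Lock()

def withdraw_buggy(amount):
    global balance
    if balance >= amount:        # check ...
        # <-- another thread may run here and also pass the check,
        #     allowing the balance to go negative
        balance -= amount        # ... then act: not atomic together

def withdraw_safe(amount):
    global balance
    with lock:                   # check and act in one atomic region
        if balance >= amount:
            balance -= amount

threads = [threading.Thread(target=withdraw_safe, args=(60,))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock, at most one withdrawal of 60 can succeed.
```

With the lock, exactly one of the two 60-unit withdrawals succeeds and the balance ends at 40; with `withdraw_buggy`, an unlucky interleaving lets both succeed.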
- 12:00-12:15: MOSAIC: Coarse-grained Reconfigurable Architecture Exploration, Stephen Friedman (PDF Slides)
Coarse-grained reconfigurable architectures (CGRAs) have the potential to offer performance approaching that of an ASIC with flexibility similar to that of a digital signal processor in highly parallel applications. In the past, individual points in this architectural space have been explored, each with either no programming support or a unique but difficult programming model.
The goal of our research is to explore the CGRA space at the level of complete systems, including language, compiler, and architectural features. In this talk, we give a high-level overview of our top-to-bottom tool-chain and look at flexibility in CGRAs using time-division multiplexing and resource sharing.
- 12:15-12:30: CHiMPS: Accelerating C code with customized caches, Andrew Putnam (PDF Slides)
CHiMPS is a C-based accelerator compiler for heterogeneous CPU-FPGA computing platforms. Its goal is to facilitate FPGA programming for high-performance computing (HPC) developers by providing them with performance that is greater (7.8x) and power consumption that is less (6.5x) than their current CPU platforms, but without sacrificing their familiar, non-HDL programming environment. The key to CHiMPS performance is its novel many-cache memory model, which generates caches that are customized for an application's memory accesses.
Pictures, Games, and Movies
- 11:40-11:45: Introduction and Overview: Brian Curless
- 11:45-12:00: Finding Paths through the World's Photos, Rahul Garg (PDF Slides)
When a scene is photographed many times by different people, the viewpoints often cluster along certain paths. These paths are largely specific to the scene being photographed, and traverse interesting regions and viewpoints. We seek to discover a range of such paths and turn them into controls for image-based rendering. Our approach takes as input a large set of community or personal photos, reconstructs camera viewpoints, and automatically computes orbits, panoramas, canonical views, and optimal paths between views. The scene can then be interactively browsed in 3D using these controls or with five degree-of-freedom free-viewpoint control. As the user browses the scene, nearby views are continuously selected and transformed, using control-adaptive reprojection techniques.
- 12:00-12:15: Fold It!: Biochemical Discoveries through Video Games, Seth Cooper
The Fold It! project is about engaging a large number of people to -- together with computers -- solve some of the most important scientific questions of today. We are developing a massively distributed biochemistry game that will enable future key discoveries in molecular science. We are casting molecular folding problems as a massively distributed 3D puzzle game, and plan to enable players to use their computers to discover solutions to current open scientific problems, including cures for cancer and AIDS, and the discovery of novel biofuels. The fundamental idea is to do user-assisted optimization for protein design, but to formulate and present it as a competitive game played by thousands of people.
- 12:15-12:30: Animation Production at UW: Pre-Production, Character Motion, Story and Digital Production, Barbara Mones
For the past ten years, upper-level computer science undergraduates have joined students from many other areas of the University (English, Cinema Studies, DXArts, Music, and Architecture, to name a few) in a collaborative and interdisciplinary adventure to design and produce a short animated film from initial concept to completion. We'll screen some of the best work we've produced to date and then provide several "shot breakdowns" from our films. We'll also show unique strategies for consistent stylistic collaboration and for facial expressions and acting. This will give you an inside view into the challenges the groups face and an understanding of the production pipeline, with an emphasis on how we iterate character motion to support our story.