Graduate Student Exams

Alex Polozov, Final Exam

Title: A Framework for Mass-Market Inductive Program Synthesis

Advisors: Zoran Popović and Sumit Gulwani

Supervisory Committee: Zoran Popović (co-Chair), Sumit Gulwani (co-Chair), Gary Hsieh (GSR, HCDE), Emina Torlak, and Ras Bodik

Abstract:

Programming by examples (PBE), or inductive program synthesis, is the problem of finding a program in the underlying domain-specific language (DSL) that is consistent with given input-output examples or constraints. In the last decade, it has gained prominence thanks to the mass-market deployments of several PBE-based technologies for data wrangling — the widespread problem of transforming raw datasets into a structured form that is more amenable to analysis.
However, deployment of a mass-market application of program synthesis is challenging. First, an efficient implementation requires non-trivial engineering insight, often overlooked in a research prototype. This insight takes the form of domain-specific knowledge and heuristics, which complicate the implementation, extensibility, and maintenance of the underlying synthesis algorithm. Second, application development should be fast and agile to keep up with evolving market requirements. Third, the underlying synthesis algorithm should be accessible to the engineers responsible for product maintenance.
 
In this work, I show how to generalize the ideas of 10+ previous specialized inductive synthesizers into a single framework, which facilitates automatic generation of a domain-specific synthesizer from the mere definition of the corresponding DSL and its properties. PROSE (PROgram Synthesis using Examples) is the first program synthesis framework that explicitly separates domain-agnostic search algorithms from domain-specific expert insight, making the resulting synthesizer both fast and accessible. The underlying synthesis algorithm pioneers the use of deductive reasoning for designer-defined domain-specific operators and languages, which enables mean synthesis times of 1–3 seconds on real-life datasets.
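
To make the separation concrete, here is a deliberately tiny sketch in the spirit of PROSE's deductive search (a hypothetical toy DSL, not the actual PROSE API): a generic top-down search enumerates operators, while a per-operator "witness function" encodes the domain-specific insight that decomposes an output specification into specifications for the operands.

```python
# Toy sketch of deductive, example-driven synthesis (hypothetical DSL with
# ConstStr, SubStr, and Concat; the real PROSE framework is far richer).

def witness_concat(inp, out):
    """Domain-specific insight for Concat(a, b): every split of the desired
    output yields a pair of sub-specifications for the two operands."""
    return [(out[:i], out[i:]) for i in range(1, len(out))]

def synthesize(inp, out, depth=2):
    """Domain-agnostic top-down search: try atomic programs, then deduce
    operand specs through Concat via its witness function."""
    programs = [("ConstStr", out)]
    if out in inp:
        programs.append(("SubStr", inp.index(out), inp.index(out) + len(out)))
    if depth > 0:
        for left, right in witness_concat(inp, out):
            for p1 in synthesize(inp, left, depth - 1):
                for p2 in synthesize(inp, right, depth - 1):
                    programs.append(("Concat", p1, p2))
    return programs

# Example: programs consistent with "Alex Polozov" -> "A. Polozov"
print(synthesize("Alex Polozov", "A. Polozov")[:3])
```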
 
Over the last two years, a dedicated team at Microsoft has built and deployed 10+ technologies on top of the PROSE framework. Using them as case studies, I examine the user interaction challenges that arise after a mass-market deployment of a PBE-powered application. In this talk, I'll show how expressing program synthesis as an interactive problem facilitates user intent disambiguation, incremental learning from additional examples, and increases the users' confidence in the system.
 
When: 2 Jun 2017 - 9:00am Where: CSE 503

Luheng He, General Exam

Title: Annotating, Learning, and Representing Shallow Semantic Structure

Advisor: Luke Zettlemoyer

Supervisory Committee: Luke Zettlemoyer (Chair), Gina-Anne Levow (GSR, Linguistics), Yejin Choi, and Noah Smith

Abstract: One key challenge in understanding human language is identifying word-to-word semantic relations, such as “who does what to whom”, “when”, and “where”. Semantic role labeling (SRL) is the widely studied challenge of recovering such predicate-argument structure, typically for verbs. SRL is designed to be consistent across syntactic variation and, to some extent, language-independent, which can potentially benefit downstream applications such as natural language inference, machine translation, and summarization. However, the performance of SRL systems is limited by the amount of training data, further impeding their usage in downstream applications. My generals work is threefold: 1) collecting more SRL data using natural-language-driven QA annotation, 2) using end-to-end neural models to accurately predict SRL, and 3) proposing a unified framework for predicting and representing SRL structure to improve performance on downstream applications.
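
For concreteness, here is an illustrative rendering (not taken from the talk) of the kind of predicate-argument structure SRL recovers, alongside the style of question-answer pairs used in QA-driven annotation:

```python
# Illustrative only: a simplified SRL frame and QA-style annotation for one
# predicate; the exact role inventory and QA format are assumptions here.
sentence = "The company acquired the startup in 2016"

srl_frame = {
    "predicate": "acquired",
    "ARG0": "The company",   # who does ...
    "ARG1": "the startup",   # ... what to whom
    "ARGM-TMP": "in 2016",   # when
}

qa_annotation = [
    ("Who acquired something?", "The company"),
    ("What was acquired?", "the startup"),
    ("When did someone acquire something?", "in 2016"),
]
```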

When: 2 Jun 2017 - 10:00am Where: CSE 678

Christopher Lin, Final Exam

Title: The Intelligent Management of Crowd-Powered Machine Learning 

Advisors: Dan Weld and Mausam

Supervisory Committee: Dan Weld (co-Chair), Mausam (co-Chair), Thomas Richardson (GSR, STAT), Eric Horvitz (MSR), and Carlos Guestrin

Abstract: 

Artificial intelligence and machine learning power many technologies today, from spam filters to self-driving cars to medical decision assistants. While this revolution has hugely benefited from developments in core areas like deep learning, it could not have occurred without data, which nowadays is frequently procured at massive scale from crowds. Because data is so crucial, a key next step towards truly autonomous agents is the design of better methods for intelligently managing the now-ubiquitous crowd-powered data-gathering process.
This dissertation takes this key next step by developing algorithms for the online and dynamic control of data acquisition. We consider how to gather data for its two primary and independent purposes: training and evaluation.

In the first part of the dissertation, we develop algorithms for obtaining data for testing. The most important requirement of testing data is that it must be extremely clean. Thus, to deal with noisy human annotations, machine learning practitioners typically rely on careful workflow design and advanced statistical techniques for label aggregation. A common process involves designing and testing multiple crowdsourcing workflows for a task, identifying the single best-performing workflow, and then aggregating worker responses from redundant runs of that single workflow. We improve upon this process in two ways: we build control models that allow for switching between many workflows depending on how well a particular workflow is performing for a given example and worker, and we build a control model that can aggregate labels from tasks that do not have a finite predefined set of multiple-choice answers (e.g., counting tasks).
We then implement agents that use our new models to dynamically choose whether to acquire more labels from the crowd or stop, and show that they can produce higher-quality labels at lower cost than state-of-the-art baselines.
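
As a rough illustration of the stop-or-continue decision these agents face, here is a minimal sketch with binary labels and a single known worker-accuracy parameter (an assumption for illustration; the dissertation's control models are richer):

```python
# Minimal sketch: posterior over the true label under independent noisy
# workers, and a stop rule combining a confidence target with a budget.

def posterior_positive(votes, prior=0.5, worker_acc=0.75):
    """P(true label = 1 | observed votes)."""
    p1, p0 = prior, 1 - prior
    for v in votes:
        p1 *= worker_acc if v == 1 else 1 - worker_acc
        p0 *= worker_acc if v == 0 else 1 - worker_acc
    return p1 / (p1 + p0)

def should_stop(votes, confidence=0.95, budget=7):
    p = posterior_positive(votes)
    return len(votes) >= budget or max(p, 1 - p) >= confidence

votes = [1, 1, 0, 1]
print(posterior_positive(votes), should_stop(votes))  # ~0.90, False
```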


In the second part of the dissertation, we shift to tackle the second purpose of data: training.
Because learning algorithms are often robust to noise, training sets, unlike test sets, need not optimize solely for accuracy; they can instead trade off quantity, accuracy, and diversity. We first investigate the tradeoff between quantity and accuracy: given a fixed budget, one can spend money acquiring cleaner labels or acquiring more examples. We examine how inductive bias, worker accuracy, and budget affect whether a larger and noisier training set or a smaller and cleaner one trains better classifiers. We then set up a formal framework for dynamically choosing the next example to label or relabel by generalizing active learning to allow for relabeling, which we call re-active learning, and we design new algorithms for re-active learning that outperform active learning baselines. Finally, we consider the effect of domain skew on strategies for increasing label diversity in a training set. We introduce an algorithm that dynamically switches between crowd generation and crowd labeling and show that it achieves better performance than state-of-the-art baselines across several domains with varying skew.
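
The core decision in re-active learning can be sketched as follows, with a hypothetical uncertainty-based scoring standing in for the dissertation's actual algorithms:

```python
# Schematic re-active learning step: compare the value of labeling a fresh
# unlabeled example against relabeling an already-labeled, possibly noisy one.

def uncertainty(p):
    """1 at p = 0.5 (model unsure), 0 at p = 0 or p = 1."""
    return 1 - abs(p - 0.5) * 2

def next_action(unlabeled, labeled):
    """unlabeled: {id: model_prob};  labeled: {id: (model_prob, n_labels)}."""
    def relabel_value(i):
        # Discount relabeling by the labels already spent on this example.
        return uncertainty(labeled[i][0]) / labeled[i][1]

    best_new = max(unlabeled, key=lambda i: uncertainty(unlabeled[i]))
    best_old = max(labeled, key=relabel_value)
    if uncertainty(unlabeled[best_new]) >= relabel_value(best_old):
        return ("label", best_new)
    return ("relabel", best_old)

print(next_action({"a": 0.52, "b": 0.90}, {"c": (0.45, 1), "d": (0.70, 3)}))
# -> ('label', 'a')
```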

When: 2 Jun 2017 - 10:00am Where: CSE 303

Kanit Wongsuphasawat, General Exam

Title: Mixed-Initiative Visualization Tools for Exploratory Data Analysis

Advisor: Jeff Heer

Supervisory Committee: Jeffrey Heer (Chair), Jevin West (GSR, iSchool), Bill Howe, and Jock Mackinlay (Tableau)

Abstract: Visual data analysis is an iterative process that involves both open-ended exploration and focused question answering. However, traditional visual analysis tools provide specification interfaces that are well suited to question answering but often tedious for exploration. This doctoral thesis investigates the design of visualization tools that augment traditional specification interfaces with automated chart specification to facilitate exploratory analysis. First, we present the design and controlled studies of Voyager, a system that couples faceted browsing with visualization recommendations to promote open-ended exploration. To facilitate both exploration and question answering, we then present Voyager 2, a visual analysis tool that blends manual and automated specification in a unified system. To support specification and recommendation in Voyager and Voyager 2, we also present the Vega-Lite visualization grammar and the CompassQL visualization query language, which defines an organized collection of charts. In ongoing work, we plan to extend our interfaces and query engine to facilitate exploratory analysis of datasets with larger numbers of fields.
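
For reference, a Vega-Lite unit specification of the kind CompassQL enumerates and Voyager recommends looks roughly like this, expressed here as a Python dict (the dataset path and field names are illustrative, drawn from the standard example datasets):

```python
# A small Vega-Lite (v2) unit specification: a mark plus encodings mapping
# data fields to visual channels. Voyager's recommender ranks many such specs.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
    "data": {"url": "data/cars.json"},   # illustrative dataset path
    "mark": "point",
    "encoding": {
        "x": {"field": "Horsepower", "type": "quantitative"},
        "y": {"field": "Miles_per_Gallon", "type": "quantitative"},
        "color": {"field": "Origin", "type": "nominal"},
    },
}
```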


When: 2 Jun 2017 - 1:30pm Where: CSE 503

James Ferguson, Qualifying Project Presentation

Title: Semi-Supervised Event Extraction with Paraphrase Clusters

Advisors: Hannaneh Hajishirzi and Dan Weld

Abstract:

Supervised event extraction systems are limited in their accuracy due to the lack of available training data. We present a method for self-training event extraction systems by taking advantage of the occurrence of multiple mentions of the same event instances across newswire articles from various sources. If our system can make a high-confidence extraction of some mentions in a cluster of news articles, it can then leverage those mentions to identify additional occurrences of the event in the cluster. This allows us to collect a large amount of new, diverse training data for a specific event. Using this method, our experiments show an increase of 1.7 F1 points for a near-state-of-the-art baseline extractor on the ACE 2005 dataset.
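
Schematically, the self-training loop can be sketched as below; `extractor` and its methods are hypothetical stand-ins for the actual system:

```python
# Sketch of cluster-based self-training: high-confidence extractions from one
# article in a news cluster license additional, silver-standard training
# mentions of the same event instance in the cluster's other articles.

def harvest_cluster(extractor, cluster, threshold=0.9):
    new_examples = []
    confident = [e for doc in cluster
                 for e in extractor.extract(doc) if e.score >= threshold]
    for event in confident:
        for doc in cluster:
            # Other mentions of the same event instance become new, diverse
            # training examples for that event type.
            for mention in extractor.find_mentions(doc, event):
                new_examples.append((mention, event.type))
    return new_examples
```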

When: 6 Jun 2017 - 2:00pm Where: CSE 303

Shu Liang, General Exam

Title: Virtualize me: Personalized 3D Head Reconstruction

Advisors: Linda Shapiro and Ira Kemelmacher

Supervisory Committee: Linda Shapiro (co-Chair), Irena Kemelmacher (co-Chair), John Kramlich (GSR, ME), and Brian Curless

Abstract: Personalized 3D face reconstruction has produced exciting results over the past few years. However, most methods focus solely on the face area and mask out the rest of the head. In this work, we propose a data-driven approach to obtaining a personalized reconstruction of a subject’s full head in the wild. We start with a visual hull shape obtained from a person’s Internet video and fill in the face details using shading information extracted from the same person’s photo collections. The hairstyle is retrieved from a synthetic hairstyle dataset and deformed to fit the person’s rough hair shape. We demonstrate the effectiveness of this algorithm on different celebrities using photos and videos downloaded from the Internet.

When: 7 Jun 2017 - 2:00pm Where: CSE 203

Catherine Baker, Final Exam

Title: Understanding and Improving the Educational Experiences of Blind Programmers

Advisor: Richard Ladner

Supervisory Committee: Richard Ladner (Chair), Jennifer Turns (GSR, HCDE), Katharina Reinecke, Alan Borning, and Andy Ko (iSchool)

Abstract: Teaching people with disabilities tech skills empowers them to create solutions to problems they encounter. However, computer science is taught in a highly visual manner, which can present barriers for people who are blind. The goal of this dissertation is to understand what those barriers are and to present the projects I have done to decrease them.

The first projects I present look at the barriers that blind students face. I first present the results of my survey and interviews with blind graduates with degrees in computer science or related fields. This work highlights the many barriers that these students overcame when they were in college. We then follow up on one of the barriers mentioned, access to technology, by conducting a preliminary accessibility evaluation of six popular integrated development environments (IDEs) and code editors. I found that half were unusable and all had some inaccessible portions.

As access to visual information is one of the barriers in computer science education, I present three projects I have done to increase access to visual information in computer science. The first project is Tactile Graphics with a Voice (TGV), which investigated an alternative to Braille labels for those who don’t know Braille and showed that TGV is a viable solution. The next project is StructJumper, which creates a modified abstract syntax tree that blind programmers can use to navigate through code. The results of the evaluation show that users could navigate more quickly and easily determine the relationships between lines of code. The final project I present is a dynamic graph tool that has two different modes for handling focus changes when moving between graphs. I found that users could use both modes to answer questions about changes in graphs, and that they wanted access to these modes, plus others, so they could select the appropriate mode for the task.
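
StructJumper itself targets Java, but a minimal Python analogue of its idea (a sketch using the standard `ast` module, not the tool itself) shows how a nesting tree derived from the AST supports structural navigation instead of line-by-line reading:

```python
# Derive a nesting tree from source code so a screen-reader user could jump
# between structural levels (functions, loops, conditionals).
import ast

source = """
def pay(items):
    total = 0
    for item in items:
        if item.ok:
            total += item.price
    return total
"""

def nesting_tree(node, depth=0):
    for child in ast.iter_child_nodes(node):
        if isinstance(child, (ast.FunctionDef, ast.For, ast.While, ast.If)):
            print("  " * depth + f"{type(child).__name__} (line {child.lineno})")
            nesting_tree(child, depth + 1)
        else:
            nesting_tree(child, depth)

nesting_tree(ast.parse(source))
# FunctionDef (line 2)
#   For (line 4)
#     If (line 5)
```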

These projects work towards the goal of making computer science education more accessible to blind students. By identifying the barriers that exist and creating solutions to overcome them, we can increase the number of blind students in computer science.

When: 8 Jun 2017 - 10:00am Where: CSE 305

Kenton Lee, General Exam

Title: End-to-end Neural Structured Prediction

Advisor: Luke Zettlemoyer

Supervisory Committee: Luke Zettlemoyer (Chair), Emily Bender (GSR, Linguistics), Yejin Choi, and Noah Smith

Abstract: Many core challenges in natural language understanding are formulated as structured prediction problems, and recently introduced neural modeling techniques have significantly improved performance in nearly every case.  However, these techniques can be challenging to develop, often requiring a trade-off between tractable inference and the use of neural representations for full output structures. We outline two approaches that make different trade-offs, as demonstrated by models for two different tasks. In CCG parsing, an A* parser improves accuracy by recursively modeling entire parse trees while maintaining optimality guarantees. In coreference resolution, where the model must scale to large documents, simpler inference is critical. Instead, an end-to-end span-level model that does not explicitly model clusters can produce state-of-the-art results.
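
Structurally, the span-level coreference model can be sketched as follows, with toy scores standing in for the learned neural scoring functions:

```python
# Schematic span-level coreference: every span gets a mention score, each
# span links to its best-scoring earlier span, and a dummy antecedent with
# fixed score 0 means "no antecedent". Scores here are toy stand-ins.

def best_antecedents(spans, mention_score, pair_score):
    links = {}
    for i in range(len(spans)):
        best, best_score = None, 0.0   # the dummy antecedent scores 0
        for j in range(i):
            s = (mention_score(spans[i]) + mention_score(spans[j])
                 + pair_score(spans[i], spans[j]))
            if s > best_score:
                best, best_score = spans[j], s
        links[spans[i]] = best
    return links

spans = ["Mary", "her mother", "She"]
toy_mention = {"Mary": 1.0, "her mother": 0.8, "She": 0.9}.get
toy_pair = lambda a, b: 0.7 if (a, b) == ("She", "Mary") else -2.0
print(best_antecedents(spans, toy_mention, toy_pair))
# -> {'Mary': None, 'her mother': None, 'She': 'Mary'}
```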

When: 8 Jun 2017 - 3:30pm Where: CSE 503

Eunsol Choi, General Exam

Title: Extracting, Inferring and Applying Knowledge about Entities

Advisors: Luke Zettlemoyer and Yejin Choi

Supervisory Committee: Luke Zettlemoyer (co-Chair), Yejin Choi (co-Chair), Emily M. Bender (GSR, Linguistics), and Dan Weld

Abstract

Real-world entities such as people, organizations, and countries play a critical role in text. Language contains rich explicit and implicit information about these entities, such as the categories they belong to, their relationships to other entities, and the events they participate in. To extract such knowledge, models must recognize entities in text and build representations of them. However, this is challenging: even in a single document, different expressions (e.g., Google, the search giant, the company, it) point to the same entity, and knowledge often can only be inferred from context without being explicitly stated.

In this work, we focus on entities to interpret text. Specifically, we introduce two approaches for extracting and inferring knowledge about entities. The first focuses on entity-entity sentiment relationships, i.e., who feels positively (or negatively) towards whom, a task that requires reasoning about the social dynamics of entities as well as targeted sentiment expressions. The second infers free-form phrases that describe appropriate types for a target entity. For example, consider the sentences “Bill robbed John. He was arrested.” The noun phrases “John” and “he” have specific types, such as “criminal”, that can be inferred from context. For this prediction task, we introduce two novel sources of distant supervision: head words of noun phrases found with an automated parser, and types heuristically extracted from the Wikipedia pages of linked named entities.
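
The first supervision source can be sketched as follows, using spaCy as a stand-in for the automated parser (an illustrative sentence and pipeline, not the paper's exact setup):

```python
# Distant supervision from syntax: treat the head word of a noun phrase as a
# free, noisy label for the phrase's fine-grained type.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The celebrated Seattle-based grunge band released a new album.")

for np in doc.noun_chunks:
    # e.g. the head word "band" supervises the type of the whole phrase.
    print(f"phrase={np.text!r:45} head/type={np.root.text!r}")
```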

As a future direction, we propose to apply knowledge about entities to improve core NLP tasks such as coreference resolution and question answering. Lastly, we also propose to create summarizations focused on specific entities and examine how entity types affect the way media describes events they participate in.

When: 9 Jun 2017 - 1:00pm Where: CSE 303

Ada Lerner, Final Exam

Title: Measuring and Improving Security and Privacy on the Web: Case Studies with QR Codes, Third-Party Tracking, and Archives

Advisors: Yoshi Kohno and Franzi Roesner

Supervisory Committee: Yoshi Kohno (co-Chair), Franzi Roesner (co-Chair), David McDonald (GSR, HCDE), and Arvind Krishnamurthy

Abstract: 

The web is deeply integrated into the economic, governmental, and social fabric of our society, and it promises to help fulfill people’s diverse and critical needs more easily, cheaply, quickly, and democratically. To fulfill these promises, however, we must study the security and privacy implications of the web and its surrounding technologies, and ensure that security and privacy are incorporated into their future evolution. I will present measurement and analysis work from my PhD examining the web and a set of its sister technologies, forming insights and foundations for making the web more secure and privacy-preserving. I identify and quantify current, past, and future properties of the web that are critical to its ability to provide security and privacy to its users, and I synthesize lessons from these measurements and analyses to guide the future designers and regulators of web technologies in ensuring that the web serves the needs of all who use it. Specifically, I will present longitudinal measurements of third-party web tracking using a novel measurement technique, as well as analyses of the security of web archives against malicious manipulation and of the global use of QR codes.

When: 9 Jun 2017 - 2:00pm Where: CSE 203

Hanchuan Li, General Exam

Title: Enabling Novel Sensing and Interaction with Everyday Objects using Commercial RFID Systems

Advisor: Shwetak Patel

Supervisory Committee: Shwetak Patel (Chair), Joan Sanders (GSR, Bioengineering), Josh Smith, Alanson Sample (Disney Research)

Abstract: 

The Internet of Things (IoT) promises an interconnected network of smart devices that will revolutionize the way people interact with their surrounding environments. This distributed network of physical devices will open up tremendous opportunities for Human-Computer Interaction and Ubiquitous Computing, creating novel user-centered and context-aware sensing applications.

The advancement of IoT has been heavily focused on creating new smart electronic devices, while the vast majority of everyday non-smart objects have been left out. There currently exists a huge gap between the collection of devices integrated into the IoT and the massive number of remaining everyday objects that people interact with in their daily lives.

Radio-frequency identification (RFID) has been widely adopted in the IoT industry as a standardized infrastructure. In this thesis proposal, I apply signal processing and machine learning techniques to low-level channel parameters of commercially available RFID tags to enable novel sensing and interaction with everyday objects. These new sensing capabilities allow our system to recognize fine-grained daily activities, create tangible passive input devices, and enhance user experiences in human-robot interaction.
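
As a rough sketch of this recipe (with hypothetical features, synthetic data, and made-up labels), low-level channel parameters such as RSSI and phase can be summarized per window of tag reads and fed to a standard classifier:

```python
# Sketch: windowed summary features over simulated RSSI/phase streams, fed to
# an off-the-shelf classifier to distinguish "touch" from "no interaction".
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def window_features(rssi, phase):
    """Summary statistics over a short window of reads of one tag."""
    return [rssi.mean(), rssi.std(), phase.mean(), phase.std(),
            np.abs(np.diff(phase)).mean()]

# Stand-in data: 100 windows each for "no interaction" (0) vs "touch" (1).
X = [window_features(rng.normal(-60 - 5 * label, 2, 50),
                     rng.normal(1.0, 0.1 + 0.3 * label, 50))
     for label in (0, 1) for _ in range(100)]
y = [0] * 100 + [1] * 100

clf = RandomForestClassifier(n_estimators=50).fit(X[::2], y[::2])
print("held-out accuracy:", clf.score(X[1::2], y[1::2]))
```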

When: 13 Jun 2017 - 12:30pm Where: CSE 305

Supasorn Suwajanakorn, Final Exam

Title: Audiovisual Persona Reconstruction

Advisors: Steven Seitz and Irena Kemelmacher

Supervisory Committee: Steven Seitz (co-Chair), Irena Kemelmacher (co-Chair), Duane Storti (GSR, ME), and Richard Szeliski (Facebook)

Abstract:

How can Tom Hanks come across so differently in "Forrest Gump" and "Catch Me If You Can"? What makes him unique in each of his roles? Is it his appearance? The way he talks? The way he moves? In my thesis, I introduce the problem of persona reconstruction. I define it as a modeling process that accurately represents the likeness of a person, and I propose solutions with the goal of creating a model that looks, talks, and acts like the recorded person. The specific aspects of persona modeled in this thesis include facial shape and appearance, motion and expression dynamics, the aging process, and speaking style, i.e., how a person talks, which I address by solving the visual speech synthesis problem.
 
These goals are highly ambitious. The key idea of this thesis is that utilizing a large amount of unconstrained data enables overcoming many of the challenges. Unlike most traditional modeling techniques, which require a sophisticated capture process, my solutions operate only on unconstrained data such as an uncalibrated personal photo and video collection, and thus can scale to virtually anyone, even historical figures, with minimal effort.
 
In particular, I first propose new techniques to reconstruct time-varying facial geometry equipped with expression-dependent texture that captures even minute shape variations such as wrinkles and creases using a combination of uncalibrated photometric stereo, novel 3D optical flow, dense pixel-level face alignment, and frequency-based image blending. Then I demonstrate a way to drive or animate the reconstructed model with a source video of another actor by transferring the expression dynamics while preserving the likeness of the person. Together these techniques represent the first system that allows reconstruction of a controllable 3D model of any person from just a photo collection. 
 
Next, I model facial changes due to aging by learning the aging transformation from unstructured Internet photos using a novel illumination-subspace matching technique. Then I apply such a transformation in an application that takes as input a photograph of a child and produces a series of age-progressed outputs between 1 and 80 years of age. The proposed technique establishes a new state of the art for the most difficult aging case of babies to adults. This is demonstrated by an extensive evaluation of age progression techniques in the literature.
 
Finally, I model how a person talks via a system that can synthesize a realistic video of a person speaking given just input audio. Unlike prior work, which requires a carefully constructed speech database from many individuals, my solution requires only existing video footage of a single person. Specifically, it focuses on a single person (Barack Obama) and relies on an LSTM-based recurrent neural network trained on Obama's footage to synthesize a high-quality mouth video of him speaking. My approach generates compelling and believable videos from audio, enabling a range of important applications such as lip-reading for hearing-impaired people, video bandwidth reduction, and creating digital humans, which are central to entertainment applications like special effects in films.
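
The audio-to-mouth mapping can be sketched as below, as a bare-bones PyTorch model; the feature and output dimensions are illustrative assumptions, not the actual system's:

```python
# Sketch: a recurrent network consumes a sequence of per-frame audio features
# (e.g., MFCCs) and emits per-frame mouth-shape coefficients, from which the
# mouth region of the video is later rendered.
import torch
import torch.nn as nn

class AudioToMouth(nn.Module):
    def __init__(self, n_audio=28, n_hidden=60, n_mouth=18):
        super().__init__()
        self.lstm = nn.LSTM(n_audio, n_hidden, batch_first=True)
        self.out = nn.Linear(n_hidden, n_mouth)

    def forward(self, audio_feats):          # (batch, frames, n_audio)
        h, _ = self.lstm(audio_feats)
        return self.out(h)                   # (batch, frames, n_mouth)

model = AudioToMouth()
mfcc_sequence = torch.randn(1, 300, 28)      # a few seconds of audio frames
mouth_shapes = model(mfcc_sequence)
print(mouth_shapes.shape)                    # torch.Size([1, 300, 18])
```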

When: 20 Jun 2017 - 12:00pm Where: CSE 203

Travis Mandel, Final Exam

Title: Better Education Through Improved Reinforcement Learning

Advisors: Zoran Popović and Emma Brunskill (Stanford)

Supervisory Committee: Zoran Popović (co-Chair), Emma Brunskill (co-Chair, Stanford), Mari Ostendorf (GSR, EE), and Dan Weld

Abstract: When a new student comes to play an educational game, how can we determine what content to give them so that they learn as much as possible? When a frustrated customer calls in to a helpline, how can we determine what to say to best assist them? When an ill patient comes into the clinic, how do we determine what tests to run and treatments to give to maximize their quality of life?

These problems, though diverse, all seem like natural fits for reinforcement learning, where an AI agent learns from experience how to make a sequence of decisions that maximizes some reward signal. However, unlike in many recent successes of reinforcement learning, in these settings the agent gains experience solely by interacting with humans (e.g., game players or patients). As a result, although the potential to directly impact human lives is much greater, intervening to collect new data is often expensive and potentially risky. Therefore, in this talk I present methods that allow us to evaluate candidate learning approaches offline using previously collected data instead of actually deploying them. Further, I present new learning algorithms which ensure that, when we do choose to deploy, the data we gather is maximally useful. Finally, I explore how reinforcement learning agents can best leverage human expertise to gradually extend the capabilities of the system, a topic in the exciting area of Human-in-the-Loop AI.
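
One standard off-policy evaluation idea in this space is importance sampling; the following is a minimal, single-step sketch (illustrative only, not the talk's specific estimators), where logged rewards are reweighted by how much more or less likely the candidate policy is to have chosen each logged action:

```python
# Importance-sampling estimate of a candidate policy's expected reward from
# data logged under a different (known) behavior policy.

def importance_sampling_estimate(logged, candidate_probs):
    """logged: list of (action, reward, logging_prob);
    candidate_probs: dict action -> probability under the candidate policy."""
    total = 0.0
    for action, reward, logging_prob in logged:
        total += (candidate_probs[action] / logging_prob) * reward
    return total / len(logged)

# Hypothetical educational-game log: actions chosen uniformly at random.
logged = [("hint", 1.0, 0.5), ("new_level", 0.0, 0.5), ("hint", 1.0, 0.5)]
candidate = {"hint": 0.8, "new_level": 0.2}
print(importance_sampling_estimate(logged, candidate))  # ~1.067
```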

Throughout the talk I will discuss how I have deployed real-world experiments and used data from thousands of kids to demonstrate the effectiveness of our techniques on several educational games.

When: 5 Jul 2017 - 12:00pm Where: CSE 403

Raymond Cheng, Final Exam

Title: Security and Privacy for Cloud Applications

Advisors: Tom Anderson and Arvind Krishnamurthy

Supervisory Committee: Tom Anderson (co-Chair), Arvind Krishnamurthy (co-Chair), Radhakrishnan Poovendran (GSR, EE), Yoshi Kohno, and Jon Howell (MSR)

Abstract: TBA

When: 10 Aug 2017 - 2:00pm Where: CSE 305