## Julian Michael, Qualifying Project Presentation

Title: Crowdsourcing Semantic Structure

Advisors: Luke Zettlemoyer and Yejin Choi

Abstract: End-to-end neural models have pushed the state-of-the-art on many natural language understanding tasks, but they are generally data-hungry and prone to overfitting. Efforts to fix this often involve using task-general semantic structures as intermediate representations, but gains from this are rare because the models to produce the intermediates often suffer from the same problems of generalization and efficiency.

Building effective models for semantic intermediates is especially difficult because annotating the training data is expensive and time-consuming. As a result, data annotated with semantics is often much smaller than the available data for end tasks, limiting its usefulness. However, recent work has explored creative methods to scale up semantic annotation with non-experts to cheaply and rapidly annotate linguistic structure.

In this talk, I describe crowdsourcing projects on three such approaches: Human-in-the-loop Parsing, Question-Answer Meaning Representation, and Question-Answer Driven Semantic Role Labeling. I outline the key lessons that have emerged from this research and a trajectory for future work on crowdsourcing semantic structure.
When: 27 Feb 2018 - 2:30pm Where: CSE 303

## Phoebe Mulcaire, Qualifying Project Presentation

Title: Polyglot Semantic Role Labeling

Advisors: Noah Smith and Luke Zettlemoyer

Abstract: Previous approaches to multilingual semantic dependency parsing treat languages independently, without exploiting the similarities between semantic structures across languages. We experiment with a new approach where we combine resources from different languages in the CoNLL 2009 shared task (Hajič et al., 2009) to build a single polyglot semantic dependency parser. Notwithstanding the absence of parallel data, and the dissimilarity in annotations between languages, our approach results in improvement in parsing performance on several languages over a monolingual baseline. Analysis of the polyglot models' performance provides a new understanding of the similarities and differences between languages in the shared task.

When: 26 Feb 2018 - 11:00am Where: CSE 303

## John C. Earls, General Examination

Title: Quantifying wellness and identifying disease transitions with personal, dense, dynamic, data clouds

Advisors: Larry Ruzzo and Nathan Price

Supervisory Committee: Larry Ruzzo (Co-Chair), Nathan Price (Co-Chair, BioE), Daniela Witten (GSR, STAT), and Ed Lazowska

Abstract:

Personal, dense, dynamic, data clouds (pD3 clouds); where diverse measures from genetics, protemics, clinical labs, metabolomics, microbiome, and personal health devices are taken longitudinally in large cohorts; open a window to individual health that has not existed on this scale before. Computational analysis of such rich health data on individuals presents the opportunity to build a 21st century learning healthcare system, that focuses not only on treating disease, but on promoting wellness and the prevention of disease.

Three major challenges presented by this unprecedented wealth of personal health data is the development of measures and techniques for: 1) understanding an individual's current wellness status; 2) personalized prediction of the effect of interventions on that wellness status; and 3) identifying early transitions to disease states, when that disease is most treatable. My work is addressing these challenges through application of statistical modeling and machine learning to pD3 clouds.

This thesis proposal will present my work in the larger context of Precision (P4) Medicine; discuss prior attempts to use "biological age" as a holistic wellness measure and my current progress developing this measure using pD3 clouds; and present a new technique for developing systems biomarkers to identify the earliest signs of disease transitions. I will demonstrate the feasibility of this proposed thesis by presenting preliminary work using pD3 clouds shared for research purposes by thousands of individuals enrolled in the Arivale program, a pD3 cloud based wellness start-up spun from the Institute for Systems Biology.

When: 26 Feb 2018 - 9:00am Where: CSE 303

## Steven Lyubomirsky, Qualifying Project Presentation

Title: Design and Evaluation of Incrementalization in SpaceSearch

Advisors: Zach Tatlock and Michael Ernst

Abstract: SpaceSearch is a library for building solver-aided verification tools within a proof assistant, allowing a user to verify the tools according to semantics defined in Coq and, using Coq's extraction facilities, produce an implementation of this solver-aided tool in Rosette. This allows a user to formally prove the correctness of the domain-specific reductions and optimizations that are commonly used to reduce the search space for a solver-aided tool and make more efficient use of the SMT solver's capabilities. In this talk, I will discuss my contributions to the design and evaluation of SpaceSearch, particularly the semantics and implementation of incrementalization, which allows for tools produced by SpaceSearch to avoid searching portions of a space that they had previously searched. I will also discuss design challenges when using SpaceSearch and possible areas of further exploration, as in the case of attempting to incorporate SpaceSearch's incrementalization feature into Bagpipe, a solver-aided BGP configuration checker built and verified in SpaceSearch.

When: 21 Feb 2018 - 1:30pm Where: CSE 678

## Jeong Joon Park, Qualifying Project Presentation

Title: Surface Light Field Fusion: High Fidelity 3D Reconstruction

Advisors: Steve Seitz, Richard Newcombe, and Brian Curless

Abstract: We present an approach for interactively scanning highly reflective objects with a commodity RGBD sensor. In addition to shape, our approach models the surface light field, encoding scene appearance from all directions. By factoring the surface light field into view-independent and wavelength-independent components, we arrive at a representation that can be robustly estimated with IR-equipped commodity depth sensors, and achieves high quality results.

When: 16 Feb 2018 - 10:30am Where: CSE 403

## Ben Jones, Qualifying Project Presentation

Title: Computationally Augmenting Algebraic Exploration

Advisors: Steve Tanimoto and Emina Torlak

Abstract: Algebra is an example of a class of problem solving where the state space of possible solutions is infinite. Computer algebra systems offer powerful tools for solving these problems when the goal state is known, but they give little help when working in an exploratory mode. In these situations, hand manipulation is still the dominant mode of exploration. We explore using search over solution graphs to augment handwritten algebra, and along the way develop a framework for problem solving on infinite search spaces.

When: 15 Feb 2018 - 10:00am Where: CSE 678

## Elizabeth Clark, Qualifying Project Presentation

Title: Writing Stories with a Machine in the Loop

Advisors: Noah Smith and Yejin Choi

Abstract: Machine-in-the-loop creative writing systems offer suggestions to writers with the goal of sparking creativity and alleviating writer's block. We performed a user study that asked people to write stories with a machine-in-the-loop system to investigate what people want in the suggestions they receive and in their interaction with the machine. In this talk, I present the results of the user study and how it motivates directions for improving machine-in-the-loop creative writing systems. These directions include improvements to the structure of the interaction and to the models providing suggestions to writers. In particular, I will focus on a model that uses entities as context to generate text that shows promise for a collaborative story writing setting.

When: 13 Feb 2018 - 10:00am Where: CSE 403

## Kelvin Luu, Qualifying Project Presentation

Title: TBA

Advisors: Noah Smith and Mari Ostendorf (EE)

Abstract: We present a study for amateur debates online. While most studies on debates focus on more expert, televised events, there is a gap in characterizing debates in lesser skilled debaters. We introduce a new dataset and find that there are significant differences in how expert debaters frame their own and their opponent's main ideas from their less skilled counterparts.

When: 7 Feb 2018 - 9:30am Where: CSE 303

## Vincent Lee, General Examination

Title: Taming Specialized Computing and Emerging Technologies with Hardware and Application Codesign

Supervisory Committee: Luis Ceze (Chair), Visvesh Sathe (GSR, EE), Mark Oskin, and Ras Bodik

Abstract: The recent collapse of Dennard scaling has left behind a performance and energy gap which has accelerated the pace towards specialized computing technologies and alternative computing techniques. It is no longer the case that reducing transistor sizes yields the same performance and energy efficiency gains that were commonplace a decade or two ago. At the same time, advances in data mining, computer vision, machine learning, and artificial intelligence relentlessly continue to demand increasing amounts of computation and energy resources. This has culminated into an unprecedented diversification of the computing landscape from GPGPUs, field programmable gate arrays, to even full custom ASICs to sustain these computational demands. While new technologies and computing platforms promise compelling performance and energy efficiency benefits, they present their own unique design challenges and idiosyncratic design constraints.

My work focuses on taming these idiosyncratic design challenges for specialized computing substrates and emerging technologies, and proposes judicious hardware and application codesign to maximize the benefits of specialization efforts. I define hardware and application codesign as the process of modifying the application and hardware synergistically to improve the overall performance, energy efficiency, computational density, or accuracy. I will demonstrate how to translate codesign insights into these improvements on top of specialized computing platforms across three regimes: (1) application codesign for near-data processing, (2) codesigning emerging applications with stochastic computation, and (3) automating the codesign processing for stochastic computing and coarse-grained reconfigurable arrays.

When: 5 Feb 2018 - 3:00pm Where: CSE 303

## Samia Ibtasam, Qualifying Project Presentation

Title: Digital Financial Services and the Impacts of Gender and Learnability on Adoption

Advisors: Richard Anderson and Kurtis Heimerl

Abstract:

Worldwide, two billion people remain unbanked, the majority of whom reside in resource-constrained environments. While banks have limited reach due to high overhead costs of physical expansion, the global increase in mobile penetration has created opportunities to serve the unbanked using mobile-based Digital Financial Services (DFS). However, there are many factors that affect the understanding, adoption, and use of DFS including learnability of the systems and gender of the users.

In this talk, we discuss ways in which previous exposure or domain knowledge improve learnability, and we recommend that metrics for learnability should include effectiveness and help sought, independent of usability. We also identify DFS adoption opportunities such as user readiness, interface improvements, and women's independence. We also discuss the role of gendered barriers in the readiness for and adoption of Digital Financial Services (DFS) and the impact of gendered roles in curtailing or enhancing the same. We present our analysis of the affordances or, lack thereof, in affordability of funds, the authority of transactions, access to technological devices, and agency of social and cultural mobility--all of which are prerequisites to fully utilizing DFS.

When: 30 Jan 2018 - 12:00pm Where: EEB 003

## Maarten Sap, Qualifying Project Presentation

Title: Connotation Frames of Power and Agency in Modern Films

Advisors: Noah Smith and Yejin Choi

Abstract: The framing of an action influences how we perceive its actor. We introduce connotation frames of power and agency, a pragmatic formalism organized using frame semantic representations, to model how different levels of power and agency are implicitly projected on actors through their actions. We use the new power and agency frames to measure the subtle, but prevalent, gender bias in the portrayal of modern film characters and provide insights that deviate from the well-known Bechdel test. Our contributions include an extended lexicon of connotation frames along with a web interface that provides a comprehensive analysis through the lens of connotation frames.

When: 25 Jan 2018 - 10:30am Where: CSE 403

## Rowan Zellers, Qualifying Project Presentation

Title: Zero-Shot Activity Recognition with Verb Attribute Induction and Neural Motifs: Scene Graph Parsing with Global Context

Abstract: In this talk, I will discuss two projects in the intersection of Computer Vision and Natural Language Processing. (1) Zero-shot multimodal learning where the topics of classification are verbs, not objects. We crowdsource verb attributes and build a model to learn them from text (word embeddings and dictionary definitions). We also use these verb attributes alongside word embeddings for activity recognition in images. (2) Scene graph generation: building a semantic graph of an image where the nodes are objects and the edges are pairwise relationships. We find that the Visual Genome dataset has many repeating structures, or motifs, and build a state-of-the-art model that captures these.

When: 24 Jan 2018 - 1:30pm Where: CSE 303

## Mark Wyse, Qualifying Project Presentation

Title: Understanding GPGPU Vector Register File Usage

Advisors: Mark Oskin and Luis Ceze

Abstract: Graphics processing units (GPUs) have emerged as a favored compute accelerator for workstations, servers, and supercomputers. At their core, GPUs are massively-multithreaded compute engines, capable of concurrently supporting over one hundred thousand active threads. Supporting this many threads requires storing context for every thread on-chip, and results in large vector register files consuming a significant amount of die area and power. Thus, it is imperative that the vast number of registers are used effectively, efficiently, and to maximal benefit. This work evaluates the usage of the vector register file in a modern GPGPU architecture. We confirm the results of prior studies, showing vector registers are reused in small windows by few consumers and that vector registers are a key limiter of workgroup dispatch. We then evaluate the effectiveness of previously proposed techniques at reusing register values and hiding bank access conflict penalties. Lastly, we study the performance impact of introducing additional vector registers and show that additional parallelism is not always beneficial, somewhat counter-intuitive to the “more threads, better throughput” view of GPGPU acceleration.

When: 24 Jan 2018 - 1:00pm Where: CSE 615

## Lucy Lin, Qualifying Project Presentation

Title: Semantic Measurement of Ideas in Text Corpora

Advisors: Noah Smith and Yejin Choi

Abstract: In this talk, we introduce the problem of semantic measurement: estimating the frequency of a natural language-expressible proposition in a corpus.  We describe a two-stage process involving (1) a fast, high-recall semantic function to filter out unlikely sentences, and (2) a more expensive syntax-based model to score the filtered sentences.  To evaluate our approach, we conduct two case studies: one on framing of policy issues in media and the other in the domain of natural disaster recovery.  Our findings validate our design decisions and motivate future work on semantic measurement applications.

When: 24 Jan 2018 - 10:00am Where: CSE 303

## Jialin Li, General Examination

Title: Rethinking Distributed Caching with Emerging Programmable Networking Hardware Support

Supervisory Committee: Dan Ports (Chair), Sreeram Kannan (GSR, EE), Arvind Krishnamurthy, and Magda Balazinska

Abstract:

Caching frequently used data is a fundamental technique to improve memory subsystem performance. Large scale web services today rely heavily on in-memory caching to reduce both client request latency and load on backend databases. Designing a distributed caching systems for these services however present several challenges. First, the system needs to cope with unpredictable changes in access patterns, especially with sudden surge in demand for certain popular items. Second, data distributed across cache servers and backend databases need to be kept consistent, even in the presence of failures in the system.

At the same time, a new class of networking hardware platform, including programmable switch ASICs, reconfigurable network accelerators, and SmartNICs, is starting to emerge. They enable datapath and application-level acceleration by pushing computation directly into the network, presenting opportunities to re-architect distributed caching systems.

In this work, we first review existing caching and cache-coherence techniques used in uniprocessors, multiprocessors, distributed shared memory, and distributed in-memory caches. We then examine several emerging networking hardware, and survey some of their use cases. Lastly, we will propose new distributed caching system designs that exploit programmable networking hardware to accelerate the caching and cache-coherence protocols.

When: 18 Jan 2018 - 10:00am Where: CSE 305

## Abram Friesen, Final Examination

Title: The Sum-Product Theorem and its Applications

Supervisory Committee: Pedro Domingos (Chair), Jeffrey Bilmes (GSR, EE), Carlos Guestrin, and Henry Kautz (University of Rochester)

Abstract:

Models in artificial intelligence (AI) and machine learning (ML) must be expressive enough to accurately capture the state of the world, but tractable enough that reasoning and inference within them is feasible. However, many standard models are incapable of capturing sufficiently complex phenomena when constrained to be tractable. In this work, I study the cause of this inexpressiveness and its relationship to inference complexity. I use the resulting insights to develop more efficient and expressive models and algorithms for many problems in AI and ML, including nonconvex optimization, computer vision, and deep learning.

When: 12 Dec 2017 - 2:00pm Where: CSE 305

## Kenton Lee, Final Examination

Title: Span-based Neural Structured Prediction

Supervisory Committee: Luke Zettlemoyer (Chair), Mari Ostendorf (GSR, EE), Yejin Choi, and Noah Smith

Abstract:

A long-standing goal in artificial intelligence is for machines to understand natural language. With ever-growing amounts of data in the world, it is crucial to automate many aspects of language understanding so that users can make sense of this data in the face of information overload. The main challenge stems from the fact that language, either in the form of speech or text, is inherently unstructured. Without programmatic access to the semantics of natural language, it is challenging to build general, robust systems that are usable in practice.

Towards achieving this goal, we propose a series of neural structured-prediction algorithms for natural language processing. In particular, we address a challenge common to all such algorithms: the space of possible output structures can be extremely large, and inference in this space can be intractable. Despite the seeming incompatibility of neural representations with dynamic programs from traditional structured prediction algorithms, we can leverage these rich representations to learn more accurate models while using simpler or lazier inference.

We focus on inference algorithms over the most basic substructure of language: spans of text. We present state-of-the-art models for tasks that require modeling the internal structure of spans, such as syntactic parsing, and modeling structure between spans, such as question-answering and coreference resolution. The proposed techniques are applicable to many structured prediction problems and we expect that they will further push the limits of neural structured prediction for natural language processing.

When: 6 Dec 2017 - 11:30am Where: CSE 403

## Ignacio Cano, General Examination

Title: Optimizing Systems using Machine Learning

Supervisory Committee: Arvind Krishnamurthy (Chair), Radha Poovendran (GSR, EE), Xi Wang, and Kevin Jamieson

Abstract: Computer systems comprise of many components that interact and cooperate with each other to perform certain task(s). Traditionally, many of these systems base their decisions on sets of rules or configurations defined by operators as well as handcrafted analytical models. However, creating those rules or engineering such models is a challenging task. First, the same system should be able to work under a combinatorial number of constraints on top of heterogeneous hardware. Second, they should support different type of workloads and run in potentially widely different settings. Third, they should be able to handle time-varying resource needs. These factors render reasoning about systems performance in general far from trivial.

In this thesis, we propose optimizing systems using Machine Learning techniques. By doing so, we aim to offload the burden of manually tuning rules and handcrafting complex analytical models from system designers, in order to bridge the gap of systems performance, and promote a generation of smarter systems that can learn from past experiences and improve their performance over time. In this talk, we present two systems that illustrate the impact of these ML-based optimizations.

First, we introduce ADARES, an adaptive system that dynamically adjusts virtual machine resources on-the-fly, namely virtual CPUs and memory, based on workload characteristics and other attributes of the virtualized environment, using Reinforcement Learning techniques. Then, we present CURATOR, a MapReduce-based framework for storage systems that safeguards the storage health and performance by executing background maintenance tasks, which we also schedule using RL.

Throughout this thesis, we present the instantiation of different ML models and (empirically) show how our formulations result in improved systems performance and efficiency. We propose pre-initializing model-free learners with historical traces to accelerate training, thus reducing the sample complexity. Our models can cope with heterogeneity in workloads, settings, and resources, as well as adapt to non-stationary dynamics.

When: 4 Dec 2017 - 3:00pm Where: CSE 303

## Connor Schenck, General Examination

Title: Deep Liquid: Combining Liquid Simulation and Perception Using Deep Neural Networks

Supervisory Committee: Dieter Fox (Chair), Samuel Burden (GSR, EE), Maya Cakmak, and Sidd Srinivasa

Abstract: Liquids are an essential part of many common tasks. While humans seem to be able to master them to varying degrees from a young age, this is still a challenge for robots. Here I propose a system to enable robots to handle liquids. I propose a Bayes' filter for liquids that combines deep neural networks with liquid simulation. The deep networks provide the dynamics and observation models of the filter while the simulator provides the training data. This combines both the generality of the simulator with the adaptability of the deep networks. The proposed method has the potential to make a significant contribution to the body of robotics research.

When: 4 Dec 2017 - 12:00pm Where: CSE 403

## Alex Takakuwa, General Examination

Title: Moving from passwords to authenticators

Supervisory Committee: Yoshi Kohno (Chair), Edward Mack (GSR, Asian Languages & Literature), Alexei Czeskis (Google), and Franzi Roesner

Abstract:

Humans have used passwords for access control since ancient times.  Upon the advent of the internet, passwords naturally transitioned to the web and have since become the standard mode of web authentication. However, over the last 25 years, certain issues with password authentication have proven to be unavoidable security and usability problems. Many within the computer security industry believe that we can improve the state of the art in both security and usability by utilizing asymmetric challenge-response protocols for authentication. For example, the FIDO Alliance, a group of industry and academic partners working together to bring secure and usable authentication protocols to the web, utilize such asymmetric cryptographic protocols to help strengthen the authentication flow. However, despite industry and academic desire to improve web authentication, passwords remain the status quo for users. In this dissertation proposal, I present the landscape of authentication protocols and attempt to solve some of the remaining technical challenges that prevent modern authentication schemes from supplanting passwords as the dominant method of web authentication.

When: 1 Dec 2017 - 2:00pm Where: CSE 203

## Gagan Bansal, Qualifying Project Presentation

Title: A Coverage-Based Utility Model for Identifying Unknown Unknowns

Advisors: Dan Weld and James Fogarty

Abstract: A classifier’s low confidence in prediction is often indicative of whether its prediction will be wrong; in this case, inputs are called known unknowns. In contrast, unknown unknowns (UUs) are inputs on which a classifier makes a high confidence mistake. Identifying UUs is especially important in safety-critical domains like medicine (diagnosis) and law (recidivism prediction). Previous work by Lakkaraju et al. (2017) on identifying unknown unknowns assumes that the utility of each revealed UU is independent of the others rather than considering the set as a whole. While this assumption yields an efficient discovery algorithm, we argue that it produces an incomplete understanding of the classifier’s limitations. In response, this work proposes a new class of utility models that rewards how well the discovered UUs cover (or “explain”) a sample distribution of expected queries. Although choosing an optimal cover is intractable, even if the UUs were known, our utility model is monotone submodular, affording a greedy discovery strategy. Experimental results on four datasets show that our method outperforms bandit-based approaches and achieves within 60.9% utility of an omniscient, tractable upper bound.

When: 29 Nov 2017 - 11:00am Where: CSE 203

## Aaron Bauer, General Examination

Title: Understanding Problem-Solving and Collaboration in an Open-Ended Scientific-Discovery Game

Supervisory Committee: Zoran Popovic (Chair), Andy Ko (GSR, iSchool), Steve Tanimoto, and Dan Weld

Abstract: Countless human pursuits depend upon creative problem solving, especially in complex, open-ended domains. As part of the growing technological support for such problem solving in digital environments, an opportunity exists to advance the understanding and design of these systems. The vast increase in the scale and richness of data on problem-solving behavior afforded by digital environments opens up new avenues for the study of this behavior. Digital environments also offer the possibility of building systems that actively guide and facilitate individual and collaborative problem solving toward the most productive outcomes. These systems could scaffold effective solving strategies for novices, intervene in the solving process to suggest areas of focus, or take the form of layers of machine intelligence that schedule individual and group work and dynamically adapt environmental parameters to achieve increased solution quality.

Few of these innovations will be possible, however, without a deep understanding of the problem-solving process in the domain of interest. Such an understanding would need to address the full space of strategies solvers employ, how they fit together and change over time, and how they contribute to both success and failure. In this presentation, I propose to develop one component of a deep understanding of the problem-solving process in the scientific-discovery game Foldit by addressing the question what makes groups and individuals successful problem solvers in Foldit? As solvers in Foldit are tasked with real-world, open-ended problems without known solutions, it is highly-suitable domain in which to study complex problem-solving behavior. I present my overall approach to understanding problem-solving in Foldit based on analysis of visual representations of the problem-solving process and exploration of the features of individual and group behavior associated with high performance. I show that patterns of strategic behavior can be identified in these representations and discuss how the visualizations and other analysis can be further developed to capture the full strategic picture of solving behavior.

When: 29 Nov 2017 - 9:00am Where: CSE 203

## Haichen Shen, General Examination

Title: Efficient Neural Network Serving System

Advisor: Arvind Krishnamurthy and Matthai Philipose

Supervisory Committee: Arvind Krishnamurthy (Co-Chair), Matthai Philipose (Co-Chair, MSR), Raadhakrishnan Poovendran (GSR, EE), Tom Anderson, and Ali Farhadi

Abstract:

Today, Deep Neural Networks (DNNs) can recognize faces, detect objects, and transcribe speech, with (almost) human performance. We expect that DNN-based applications will soon become an important workload. Despite of their excellent performance, the computational demands of DNN workloads are high enough that such applications strain device battery, cloud cost budgets, and latency requirement. Existing techniques such as model optimization and batching, though seem promising, have to make compromises in the real-world deployment. Model optimization trades off accuracy for lower computation demand. Batching, on the other hand, cannot achieve optimal performance due to low latency requirement and increased model variations caused by model optimization.

In this proposal, we demonstrate that it is possible to build end-to-end systems that execute DNNs efficiently at low cost, low latency, and high accuracy. To achieve this, we leverage the existing techniques including model optimizations and batching, and propose three systems to improve the efficiency. Sequential specialization takes advantage of temporal locality in the streaming settings and produces low-cost and high-accuracy DNN models at runtime. MCDNN describes an approximation-based execution framework across mobile devices and cloud that achieves high accuracy under resource constraints. It uses a heuristic scheduler that allocates resources proportionally to their frequency of use and systematically trades off accuracy for resource use. Third, we present Nexus, a DNN execution service optimized for GPU clusters. Nexus models DNN as networks of dense linear algebra operations and adopts a batching-aware cluster resource allocation and scheduling framework to improve GPU utilization. We evaluate all three systems under realistic workloads, and results show that these systems can achieve significant improvement in cost, accuracy, and utilization compared to baselines.

When: 28 Nov 2017 - 2:30pm Where: CSE 203

## Xin Yang, Qualifying Project Presentation

Title: Time-Space Tradeoffs for Learning from Small Test Spaces: Learning Low Degree Polynomial Functions

Advisors: Paul Beame and Shayan Oveis Gharan

Abstract: We develop an extension of recent analytic methods for obtaining time-space tradeoff lower bounds for problems of learning from uniform random test samples. With our methods we can obtain bounds for general finite learning problems in which the space of tests can be significantly smaller than the space of inputs and the feedback from each test can come from an arbitrary finite set.

As applications that follow from our new techniques, we show that any algorithm that learns m-variate polynomial functions of degree at most d over F_2 with success at least 2^O(m) from evaluations on randomly chosen inputs either requires space \Omega(mn/d) or 2^{\Omega(m/d)} time where n = (m/d)^{\theta(d)} is the dimension of the space of such polynomials. We also extend these results to learning linear and quadratic polynomials over any odd prime field F_p.
When: 28 Nov 2017 - 1:30pm Where: CSE 303

## Christopher Clark, Qualifying Project Presentation

Title: Simple and Effective Multi-Paragraph Reading Comprehension

Advisors: Luke Zettlemoyer and Matt Gardner (AI2)

Abstract: Recent work has shown that deep learning models can be used to answers questions by reading a related paragraph. In this talk we consider the task of extending these models to work when entire documents are given as input. Our proposed solution trains models to produce well calibrated confidence scores for their results on individual paragraphs. We sample multiple paragraphs from the documents during training, and use a shared-normalization training objective that encourages the model to produce globally correct output. We combine this method with a state-of-the-art pipeline for training models on document QA data. Experiments demonstrate strong performance on several document QA datasets. Overall, we are able to achieve a score of 71.3 F1 on the web portion of TriviaQA, a large improvement from the 56.7 F1 of the previous best system. A demo of our system is available at documentqa.allenai.org.

When: 27 Nov 2017 - 2:00pm Where: CSE 203

## Johannes Linder, Qualifying Project Presentation

Title: Deep Learning of millions of randomized APA variants

Advisors: Georg Seelig and Larry Ruzzo

Abstract: Interpreting the gene-regulatory code is integral for predicting the impact of mutations and for engineering synthetic constructs. With recent advances in large-scale genetic data collection, researchers are studying regulatory mechanisms more and more in a data-driven way, employing state-of-the-art Neural Networks to learn about them. At the same time, new methods for interpreting Neural Nets are constantly being developed, mainly in the Image recognition community. This paper investigates the regulation of Alternative Polyadenylation (APA), a post-transcriptional mechanism that is linked to both transcriptome variability and genetic disease. Using millions of synthetically engineered randomized gene variants, a high-performing Neural Network is trained to predict APA site selection and cleavage specificity. The network is capable of inferring mutational impacts and when fine-tuned on a small number of naturally occurring samples, it can predict APA isoform levels in the human genome. Finally, by adapting visualization techniques to the Sequence-based architecture, the regulatory code learned by the network is deciphered.

When: 27 Nov 2017 - 11:00am Where: Voyager Room - CSE 503

## Qiao Zhang, General Examination

Title: Failure Diagnosis for Datacenter Applications

Advisors: Tom Anderson and Arvind Krishnamurthy

Supervisory Committee: Tom Anderson (Co-Chair), Arvind Krishnamurthy (Co-Chair), Sreeram Kannan (GSR, EE), and Xi Wang

Abstract:

Fast and accurate failure diagnosis remains a major challenge for datacenter operators. Current datacenter applications are increasingly architected around loosely-coupled modular components: each component can scale and evolve independently and achieve higher performance and reliability in the common case. However, when failures occur, datacenter operators may face significant challenges in diagnosing and locating those failures. First, component dependency can be complex and sometimes not well accounted for in failure diagnosis. Second, components can exhibit gray failures that confound failure diagnosis. Third, opaque routing in the network can further hinder failure diagnosis.

My thesis is that fast and accurate failure diagnosis for datacenter applications is possible using three key ideas:
(1) a global view of component interactions and dependencies,
(2) a penalized-regression-based algorithm that can localize both fail-stop and gray failures,
(3) a predictable network architecture that allows for failure localization despite evolving in-network features.

I present preliminary results for two complementary systems that demonstrate this. The first, Deepview, is a system that can diagnose virtual hard disk (VHD) failures in Infrastructure-as-a-Service (IaaS) clouds. Deepview composes a global view of the IaaS stack, and uses an algorithm which integrates Lasso regression and hypothesis testing to localize component failures. Deepview is already deployed at one of the largest IaaS providers, and is shown to localize VHD failures to compute, storage and network components in an accurate and timely fashion. The second, Volur, is a prototype predictable network architecture. It complements Deepview by proposing a design principle for datacenter network that makes in-network routing predictable to the end-hosts for failure diagnosis and load balancing. In simulation and testbed experiments, Volur is shown to accurately localize non-fail-stop link or switch failures as well as to recover from them quickly.

When: 21 Nov 2017 - 11:00am Where: CSE 203

## Yao Lu, General Examination

Title: Building and accelerating a declarative platform for cloud machine learning

Supervisory Committee: Linda Shapiro (Chair), Mark Ganter (GSR, ME), Srikanth Kandula (Microsoft Research), and Magda Balazinska

Abstract:

Machine learning over big-data is blooming with numerous applications including traffic video analytics, internet user modeling and fraud detection. Processing machine learning queries on previous cloud platforms is yet unsatisfactory; manual and ad-hoc tuning is required for degrees of parallelism, resource usage, and algorithmic optimizations.

We describe Optasia, a novel dataflow system that efficiently processes large-scale machine learning queries on the cloud. Optasia maps machine learning operations to a SQL-like declarative language and applies a powerful query optimizer for auto parallelization and multi-query de-duplication. Evaluation on a traffic video dataset shows many-fold improvements in query completion time and resource usage relative to existing systems. To further accelerate the query processing, we note that, machine learning algorithms are modeled as user-defined functions (UDFs) in current dataflow systems. Predicates on the UDF outputs (e.g., “has-red-cars=true”), even if highly selective, force a large amount of data to be needlessly analyzed. Accelerating machine learning queries with UDFs is yet an open question. In this work, we review and extend predicate pushdown by constructing and applying probabilistic predicates (PPs) to filter data blobs that do not satisfy the query predicate; such filtering is parametrized to different target accuracies. To support complex query predicates, we augment a cost-based query optimizer to choose plans with appropriate combinations of simpler probabilistic predicates. Experiments on several practical machine learning workloads show that query processing can be boosted by as much as 10x without losing accuracy.

When: 21 Nov 2017 - 9:00am Where: CSE 303

## Helga Guðmundsdóttir, Qualifying Project Presentation

Title: Interactive Analysis of Performance Measurements with Viska

Advisors: Magda Balazinska and Dan Ports

Abstract: The ultimate goal of system performance analysis is to identify the underlying causes for performance differences between different systems and different workloads. We make this goal easier to achieve with Viska, a new tool for generating and interpreting performance measurement results. Viska leverages cutting-edge techniques from big data analytics and data visualization to aid and automate this analysis, and helps users derive meaningful and statistically sound conclusions using state-of-the-art causal inference and hypothesis testing techniques.

When: 20 Nov 2017 - 2:30pm Where: CSE 405

## Tomer Kaftan, Qualifying Project Presentation

Title: Cuttlefish: A Lightweight Primitive for Online Tuning

Advisors: Magda Balazinska and Alvin Cheung

Abstract:

Modern data processing applications execute increasingly sophisticated analysis that requires operations beyond traditional relational algebra. As a result, operators in query plans grow in diversity and complexity. Designing query optimizer rules and cost models to choose physical operators for all of these novel logical operators is impractical. To address this challenge, we develop Cuttlefish, a new primitive for online query plan tuning that explores the physical operator instances during query execution and exploits the fastest ones, using multi-armed bandit reinforcement learning techniques. We prototype Cuttlefish in Apache Spark and tune operators for image convolution, regular expression matching, and relational joins. Our experiments show Cuttlefish-tuned adaptive convolution and regular expression operators can reach 72-99% of the throughput of an all-knowing oracle that always selects the optimal algorithm, even when individual physical operators are up to 105x slower than the optimal. Additionally, Cuttlefish achieves join throughput improvements of up to 7.5x compared with Spark SQL’s query optimizer.

When: 17 Nov 2017 - 12:00pm Where: CSE 405

## Antoine Kaufmann, General Examination

Title: High-speed network packet processing with NIC-software co-design

Supervisor Committee: Tom Anderson (Chair), Scott Hauck (GSR, EE), Simon Peter (University of Texas- Austin), Xi Wang, Arvind Krishnamurthy, and Michael Taylor

Abstract: Data center applications by design rely heavily on network communication. Network bandwidth in data centers continues to increase, but at the same time processor performance only improves at a slower pace. This puts increasing pressure on software packet processing. Existing approaches such as kernel-bypass and RDMA reduce software processing overheads but trade off policy compliance or flexibility for better performance.

Data center packet processing can be efficient, scalable, policy compliant and flexible. We propose a unique refactoring of packet processing functionality by co-designing processing in the NIC, the OS kernel, and the application. In the common case, the NIC delivers data packets directly to the application, while the OS kernel runs out-of-band management tasks and handles uncommon processing. We present a hardware model for re-configurable NICs, a TCP stack based on our co-design approach, and three case studies of applying co-design directly to data center applications.

When: 6 Nov 2017 - 2:30pm Where: CSE 403

## Pavel Panchekha, General Examination

Title: Automated reasoning about web page layout

Advisors: Michael Ernst and Zachary Tatlock

Supervisory Committee: Michael Ernst (Co-Chair), Zachary Tatlock (Co-Chair), Andy Ko (GSR, iSchool), Shoaib Kamil (Adobe Research), and James Fogarty

Abstract: Web pages are a common way for people to interact with applications critically important to their health, education, and livelihood. Web pages must correspondingly be accessible to all and easily usable. However, the technologies that web pages are developed in, chiefly HTML and CSS, are complex, with many unusual features and special cases. This talk discusses Cassius, a new attempt to make it easier to test and verify web page accessibility and usability. Cassius is a mathematical description of the web page layout algorithm used by web browsers, and an encoding of this specification to Satisfiability Modulo Theories, a format that allows efficient automated reasoning about web pages. Cassius has been used to build VizAssert, a tool to verify accessibility and usability assertions of a web page over all possible user configurations. The talk will discuss Cassius, VizAssert, and chart out ongoing work to make Cassius modular, allowing efficient reasoning about large web pages and web page components.

When: 3 Nov 2017 - 9:30am Where: CSE 303

## John Thickstun, Qualifying Project Presentation

Title: Supervising Music Transcription

Abstract: Music transcription can be viewed as a multi-label classification problem, in which we identify notes present in an audio recording at time t based on a two-sided contextual window of audio surrounding t. We introduce a large-scale dataset, MusicNet, consisting of music recordings and labels suitable to supervising transcription and other learning tasks. We will discuss the construction of the dataset, including optimal-alignment protocols and the value of side-information. We will then turn our attention to network architectures and data augmentations that lead to state-of-the-art performance for music transcription. Along the way, we will consider several scientific questions: what are the low level features of musical audio? What are the invariances of these recordings?

When: 2 Nov 2017 - 1:30pm Where: CSE 303

## Xuan Luo, Qualifying Project Presentation

Title: Pepper's Cone: An Inexpensive Do-It-Yourself 3D Display

Abstract: In this talk, I will introduce a simple 3D display that can be built from a tablet computer and a plastic sheet folded into a cone. This display allows viewing a three-dimensional object from any direction over a 360-degree path of travel without the use of special glasses. Inspired by the classic Pepper's Ghost illusion, our approach uses a curved transparent surface to reflect the image displayed on a 2D display. By properly pre-distorting the displayed image our system can produce a perspective-correct image to the viewer that appears to be suspended inside the reflector. We use the gyroscope integrated into modern tablets to adjust the rendered image based on the relative orientation of the viewer. Our particular reflector geometry was determined by analyzing the optical performance and stereo-compatibility of a space of rotationally-symmetric conic surfaces. We present several prototypes along with side-by-side comparisons with reference images.

When: 31 Oct 2017 - 4:00pm Where: CSE 624

## Sam Elliott, Qualifying Project Presentation

Title: Putting the Checks into Checked C

Advisors: Ras Bodik and Dan Grossman

Abstract:

Checked C is an extension to C that aims to provide a route for
programmers to upgrade their existing C programs to a safer language
without losing the low-level control they enjoy. The first origin of
unsafe code that Checked C is addressing is spatial memory safety,
such as buffer overruns and out-of-bounds memory accesses.

and array types to C, and a method for annotating pointers with
expressions denoting the bounds of the objects they reference in
memory. To ensure memory safety, the Checked C compiler uses a mixture
of static and dynamic checks over these additions to the C language.

This talk concerns these Dynamic Checks, and the
algorithms and infrastructure required to support them, including: the
soundness property Checked C is aiming to preserve, propagation rules
for bounds information, and the code generation algorithm for the
checks themselves. This talk includes an evaluation of these dynamic
bounds checks, their overhead, and their interaction with a
state-of-the-art optimizer.

When: 31 Oct 2017 - 9:00am Where: CSE 674

## Minjoon Seo, General Examination

Supervisory Committee: Ali Farhadi (Co-Chair), Hannaneh Hajishrizi (Co-Chair, EE), Gina-Anne Levow (GSR, Linguistics), and Oren Etzioni

Abstract: Reasoning can be defined as an ability to produce new knowledge from existing knowledge. While reasoning in a formally defined (logical) environment is well-studied, constructing the ontology and annotating training data for grounding surface forms (natural language) require tremendous amount of effort from human experts, still without guarantee of success. In this talk, I will discuss two recent projects on learning to (1) read and comprehend (ground) natural language and (2) reason over multiple facts in a synthetic setup, both with end-to-end training and minimal supervision of question answers. Then I will demonstrate that while language understanding and reasoning could be tackled one by one, there is a clear challenge in learning both at the same time. I will finally end my talk with my thesis proposal for immediate next steps to address the challenge and contribute towards building a general-purpose reasoning model.

When: 17 Oct 2017 - 3:30pm Where: CSE 403

## Xin Ru Wang, General Examination

Title: Multimodal machine learning techniques for naturalistic unlabelled long-term neural, audio and video recordings

Advisors: Rajesh Rao and Bingni Brunton

Supervisory Committee: Rajesh Rao (Co-Chair), Bingni Brunton (Co-Chair), Katherine Steele (GSR, ME), and Ali Farhadi

Abstract

In our data-rich world today, the need for better capture and storage of large everyday data are spurring the development of many techniques. However, methods for analysis are still often only trained and tested on small sets of well-curated and labelled samples. To truly garner the potential of the huge amounts of real-world data being recorded every second, we must develop techniques that can not only interpret the signals in lab-like conditions but remain robust in natural circumstances. Data recorded from and for bioelectronic technologies are especially noisy, such as ones for developing useful interfaces between brains and machines (BCI). My research develops and integrates multimodal machine learning techniques to tackle these large, noisy and unlabelled recordings. In my presentation, I will explore the connections between the fields of computer vision, speech recognition, machine learning and neural decoding. I will also summarize my previous work in applying deep learning to a long-term naturalistic multimodal dataset with neural, video and audio recordings to predict future human movements. The proposed work will regress as well as detect future movements, predict natural speech and attempt to map the human brain with semantic associations. As well, I propose to share this rare dataset with the wider scientific community.

When: 4 Oct 2017 - 11:00am Where: CSE 678

## Robbie Weber, Qualifying Project Presentation

Title: A Simply Exponential Upper Bound on the Number of Stable Matchings

Advisors: Anna Karlin and Shayan Oveis Gharan

Abstract: For a stable matching instance with n men and n women let f(n) denote the maximum number of stable matchings. It has been a long-standing open problem to understand the asymptotic behavior of f(n) as n goes to infinity. The problem was first posed by Knuth in 1970s. Until now the best lower bound was approximately 2.28^n, and the best upper bound was n!/(c')^n, for some constant c'. Here, we show that for any n, f(n)  c^n for some universal constant c>1. This matches the lower bound up to the base of the exponent. Our proof is based on a reduction to counting the number of downsets of a family of posets that we call mixing. The latter could be of independent interest.

When: 22 Sep 2017 - 12:00pm Where: CSE 303

Title: Interactive In-Situ Scene Capture on Mobile Devices

Supervisory Committee: Steve Seitz (Chair), Alex Anderson (GSR, Architecture), Brian Curless, and Michael Cohen

Abstract: Architectural visualizations of indoor scenes enable compelling applications in several areas, such as real estate, interior design, cultural heritage preservation, and more recently, immersive Virtual Reality. Computer-Aided Design (CAD) tools have been invaluable for creating such visualizations. However, creating detailed, attractive visualizations of scenes remains a challenging task, particularly for non-experts. User interfaces for CAD tools tend to be complex and require significant manual effort to operate. These tools are also designed to be used ex-situ, or off-site, making it difficult to record and reproduce details faithfully.

In this thesis, I propose novel techniques and systems for interactive in-situ scene capture on mobile devices, that lets non-expert users quickly and easily capture useful architectural visualizations of indoor scenes. These systems are built upon two key insights; 1) sensors on mobile devices can be leveraged to capture important aspects of the scene such as dimensions, room shape, furniture placement, etc., and 2) an in-situ user can assist in the modeling task by acting as a guide for reconstruction and object recognition algorithms. Based on these insights, the semi-automatic systems that I propose combine the strengths of the user, who is good at high-level semantic reasoning, and the computer, which excels at combinatorics and numerical optimization.

In this talk, I will present the design, implementation and evaluation of three systems for in-situ indoor scene capture on mobile devices.

When: 7 Sep 2017 - 11:30am Where: CSE 305

## Irene Zhang, Final Examination

Title: Distributed Operating Systems for Mobile/Cloud Applications

Supervisory Committee: Hank Levy (Chair), Scott Hauck (GSR, EE), Arvind Krishnamurthy, and Luis Ceze

Abstract: The convergence of ubiquitous mobile devices, large-scale cloud platforms and pervasive network connectivity have changed the face of modern user applications. Unlike a traditional desktop application of the past, which runs on a single machine and supports a single user, the typical user-facing application today spans numerous mobile devices and cloud servers while supporting large numbers of users. This shift has significantly increased the difficulty of building user applications due to the challenges of the mobile/cloud environment (e.g., partial failures, network partitions) and the new requirements of mobile/cloud applications (e.g., availability, scalability).  At the same time, existing systems do little to help application programmers cope with these challenges and requirements.

This thesis proposes a new type of mobile/cloud operating system designed to meet the needs of modern applications. Mobile/cloud applications are the standard application of the future; thus, they deserve a first-class operating system to simplify their development and management. This thesis includes three systems that make up the world's first mobile/cloud operating system: (1) Sapphire, a new distributed runtime and process management system, (2) Diamond, a new distributed memory management system, and (3) TAPIR, a new distributed storage system. Together, these systems introduce several new operating systems abstractions and mechanisms designed to eliminate the challenges and simplify the requirements of mobile/cloud applications. This thesis demonstrates that, like operating systems of the past, these systems make it easier for application programmers to build larger and more complex applications.

When: 1 Sep 2017 - 10:00am Where: CSE 303

Title: Semantic 3D Scene Understanding for Virtual and Augmented Reality

Supervisory Committee: Steve Seitz (Chair), Daniela Rosner (GSR, HCDE), Brian Curless, Alexei Efros, and Sergey Levine

Abstract: Semantic 3D scene understanding is essential in applications that incorporate interaction with the 3D world such as Virtual and Augmented Reality (VR/AR). In my talk, I will explore several prior approaches on semantic 3D scene understanding for VR/AR applications and propose a 3D scene CAD model reconstruction from single image. Given a single photo of a room and a large database of object CAD models, our goal is to reconstruct a scene that is as similar as possible to the scene depicted in the image, and composed of objects drawn from 3D shape database. My proposed work is a completely automatic system to address this IM2CAD problem that produces high quality results on challenging indoor scenes by iteratively optimizing the placement and scale of objects to best match scene renderings in simulation to the input image, using image comparison metrics trained via deep convolutional neural nets.  We also show the applicability of our method in standard scene understanding benchmarks where we obtain significant improvement.

When: 15 Aug 2017 - 2:30pm Where: CSE 303

## Raymond Cheng, Final Examination

Title: Practical Improvements to User Privacy in Cloud Applications

Advisors: Tom Anderson and Arvind Krishnamurthy

Supervisory Committee: Tom Anderson (Co-Chair), Arvind Krishnamurthy (Co-Chair), Raadhakrishnan Poovendran (GSR, EE), Yoshi Kohno, and Jon Howell (MSR)

Abstract: As the cloud handles more user data, users need better techniques to protect their privacy from adversaries looking to gain unauthorized access to sensitive data. Today’s cloud services offer weak assurances with respect to user privacy, as most data is processed unencrypted in a centralized location by systems with a large trusted computing base. While current architectures enable application development speed, this comes at the cost of susceptibility to large-scale data breaches.

In this thesis, I argue that we can make significant improvements to user privacy from both external attackers and insider threats. In the first part of the thesis, I develop the Radiatus architecture for securing fully-featured cloud applications from external attacks. Radiatus secures private data stored by web applications by isolating server-side code execution into per-user sandboxes, limiting the scope of successful attacks. In the second part of the thesis, I focus on a simpler messaging application, Talek, securing it from both external and insider threats. Talek is a group private messaging system that hides both message contents as well as communication patterns from an adversary in partial control of the cloud.

Both of these systems are designed to provide better security and privacy guarantees for users under realistic threat models, while offering practical performance and development costs. This thesis presents an implementation and evaluation of both systems, showing that improved user privacy can come at acceptable costs.

When: 10 Aug 2017 - 2:00pm Where: CSE 305

## Sam Sudar, Final Examination

Title: Leveraging the Mobile Web in Resource-Constrained Environments

Supervisory Committee: Richard Anderson (Chair), Ryan Calo (GSR, School of Law), Franzi Roesner, and Matt Welsh (Google)

Abstract: Mobile phones and the internet are powerful technologies affecting many areas of society. Most advances in these areas disproportionately benefit the wealthy and well-connected. My work has explored ways that mobile technology and the web can be used in resource-constrained settings, and specifically in the developing world. This talk will describe a framework for simplifying mobile workflows and for sharing web content on local networks.

When: 10 Aug 2017 - 10:00am Where: CSE 303

## Mark Yatskar, Final Examination

Title: Natural Language as a Scaffold for Visual Recognition

Supervisory Committee: Luke Zettlemoyer (Chair), Mari Ostendorf (GSR, EE), Yejin Choi, and Ali Farhadi

Abstract: A goal of artificial intelligence is to create a system that can perceive and understand the visual world through images. Central to this goal is defining what exactly should be recognized, both in structure and coverage. Numerous competencies have been proposed, ranging from low level tasks such as edge detection to high level tasks, such as semantic segmentation. In each case, a specific set of visual targets is considered (e.g. particular objects or activities to be recognized) and it can be difficult to define a comprehensive set of everything that could be present in the images. In contrast to these efforts, we consider taking a broader view of visual recognition. We propose to use natural language as a guide for what people can perceive about the world from images and what ultimately machines should emulate.

We show it is possible to use unrestricted words and large natural language processing ontologies to define relatively complete sets of targets for visual recognition. We explore several core questions centered around this theme: (a) what kind of language can be used, (b) what it means to label everything and (c) can structure in language be used to define a recognition problem. We make progress in several directions, showing for example that highly ambiguous sentimental language can be used to formulate concrete targets and that linguistics feature norms can be used to densely annotate many complex aspects of images.

Finally, this thesis introduces situation recognition, a novel representation of events in images that relies on two natural language processing resources to achieve scale and expressivity. The formalism combines WordNet, an ontology of nouns, with FrameNet, an ontology of verbs and implicit argument types, and is supported by a newly collected large scale image resource imSitu. Situation recognition significantly improves over existing formulations for activities in images, allowing for higher coverage, increased richness of the representation, and more accurate models. We also identify new challenges with our proposal, such as rarely observed target outputs, and develop methods for addressing them.
When: 10 Jul 2017 - 11:00am Where: CSE 203

## Travis Mandel, Final Examination

Title: Better Education Through Improved Reinforcement Learning

Advisors: Zoran Popović and Emma Brunskill (Stanford)

Supervisory Committee: Zoran Popović (co-Chair), Emma Brunskill (co-Chair, Stanford), Mari Ostendorf (GSR, EE), and Dan Weld

Abstract: When a new student comes to play an educational game, how can we determine what content to give them such that they learn as much as possible?  When a frustrated customer calls in to a helpline, how can we determine what to say to best assist them?  When a ill patient comes in to the clinic, how do we determine what tests to run and treatments to give to maximize their quality of life?

These problems, though diverse, are all a seemingly natural choice for reinforcement learning, where an AI agent learns from experience how to make a sequence of decisions to maximize some reward signal.  However, unlike many recent successes of reinforcement learning, in these settings the agent gains experience solely by interacting with humans (e.g. game players or patients). As a result, although the potential to directly impact human lives is much greater, intervening to collect new data is often expensive and potentially risky .  Therefore, in this talk I present methods that allow us to evaluate candidate learning approaches offline using previously-collected data instead of actually deploying them. Further, I present new learning algorithms which ensure that, when we do choose to deploy, the data we gather is maximally useful. Finally, I explore how reinforcement learning agents should best leverage human expertise to gradually extend the capabilities of the system, a topic which lies in the exciting area of Human-in-the-Loop AI.

Throughout the talk I will discuss how I have deployed real-world experiments and used data from thousands of kids to demonstrate the effectiveness of our techniques on several educational games.

When: 5 Jul 2017 - 12:00pm Where: CSE 403

## Supasorn Suwajanakorn, Final Examination

Title: Audiovisual Persona Reconstruction

Advisors: Steven Seitz and Irena Kemelmaher

Supervisor Committee: Steven Seitz (co-Chair), Irena Kemelmaher (co-Chair), Duane Storti (GSR, ME), and Richard Szeliski (Facebook)

Abstract: How can Tom Hanks come across so differently in "Forrest Gump" and "Catch Me If You Can?" What makes him unique in each of his roles? Is it his appearance? The way he talks? The way he moves? In my thesis, I introduce the problem of persona reconstruction. I define it as a modeling process that accurately represents the likeness of a person, and propose solutions to address the problem with the goal of creating a model that looks, talks, and acts like the recorded person. The specific aspects of persona modelled in this thesis include facial shape and appearance, motion and expression dynamics, the aging process, the speaking style and how a person talks through solving the visual speech synthesis problem.

These goals are highly ambitious. The key idea of this thesis is that the utilization of a large amount of unconstrained data enables overcoming many of the challenges. Unlike most traditional modeling techniques which require a sophisticated capturing process, my solutions to these tasks operate only on unconstrained data such as a uncalibrated personal photo and video collection, and thus can be scaled to virtually anyone, even historical figures, with minimal efforts.

In particular, I first propose new techniques to reconstruct time-varying facial geometry equipped with expression-dependent texture that captures even minute shape variations such as wrinkles and creases using a combination of uncalibrated photometric stereo, novel 3D optical flow, dense pixel-level face alignment, and frequency-based image blending. Then I demonstrate a way to drive or animate the reconstructed model with a source video of another actor by transferring the expression dynamics while preserving the likeness of the person. Together these techniques represent the first system that allows reconstruction of a controllable 3D model of any person from just a photo collection.

Next, I model facial changes due to aging by learning the aging transformation from unstructured Internet photos using a novel illumination-subspace matching technique. Then I apply such a transformation in an application that takes as input a photograph of a child and produces a series of age-progressed outputs between 1 and 80 years of age. The proposed technique establishes a new state of the art for the most difficult aging case of babies to adults. This is demonstrated by an extensive evaluation of age progression techniques in the literature.

Finally, I model how a person talks via a system that can synthesize a realistic video of a person speaking given just an input audio. Unlike prior work which requires a carefully constructed speech database from many individuals, my solution solves the video speech problem by requiring only existing video footage of a single person. Specifically, it focuses on a single person (Barack Obama) and relies on an LSTM-based recurrent neural network trained on Obama's footage to synthesize a high-quality mouth video of him speaking. My approach generates compelling and believable videos from audio that enable a range of important applications such as lip-reading for hearing-impaired people, video bandwidth reduction, and creating digital humans which are central to entertainment applications like special effects in films.

When: 20 Jun 2017 - 12:00pm Where: CSE 203

## Hanchuan Li, General Examination

Title: Enabling Novel Sensing and Interaction with Everyday Objects using Commercial RFID Systems

Supervisory Committee: Shwetak Patel (Chair), Joan Sanders (GSR, Bioengineering), Josh Smith, and Alanson Sample (Disney Research)

Abstract: The Internet of Things (IoT) promises an interconnected network of smart devices that will revolutionize the way people interact with their surrounding environments. This distributed network of physical devices will open up tremendous opportunities for Human-Computer Interaction and Ubiquitous Computing, creating novel user-centered and context-aware sensing applications.

The advancement of IoT has been heavily focused on creating new and smart electronic devices, while the vast majority of everyday non-smart objects are left unchecked. There currently exists a huge gap between the collection of devices integrated to the IoT and the remaining massive number of everyday objects that people interact with in their daily living.

Radio-frequency Identification (RFID) has been widely adopted in the IoT industry as a standardized infrastructure. In this thesis proposal, I apply signal processing and machine learning techniques to low-level channel parameters of commercially available RFID tags to enable novel sensing and interaction with everyday objects.These new sensing capabilities allow for our system to recognize fine-grain daily activities, create tangible passive input devices, and enhance user experiences in human-robot interactions.

When: 13 Jun 2017 - 12:30pm Where: CSE 305

Title: Measuring and Improving Security and Privacy on the Web: Case Studies with QR Codes, Third-Party Tracking, and Archives

Advisors: Yoshi Kohno and Franzi Roesner

Supervisory Committee: Yoshi Kohno (co-Chair), Franzi Roesner (co-Chair), David McDonald (GSR, HCDE), and Arvind Krishnamurthy

Abstract: The web is deeply integrated into the economic, governmental, and social fabrics of our society, and it promises to help fulfill people’s diverse and critical needs easier, cheaper, faster, and more democratically. However, to fulfill these promises, we will need to study the security and privacy implications of the web and its surrounding technologies, and ensure that security and privacy are incorporated into their future evolution. I will present measurement and analysis work from my PhD examining the web and a set of the web’s sister technologies, forming insights and foundations for the ways that we can make the web more secure and privacy preserving. I identify and quantify current, past, and future properties of the web which are critical to its ability to provide security and privacy to its users, and synthesize lessons from these measurements and analyses which aim to guide the future designers and regulators of technologies surrounding the web in ensuring that it serves the needs of all who use it. Specifically, I will present longitudinal measurements of third-party web tracking using a novel measurement technique, as well as analyses of the security of web archives against malicious manipulation and of the global use of QR codes.

When: 9 Jun 2017 - 2:00pm Where: CSE 203

## Eunsol Choi, General Examination

Title: Extracting, Inferring and Applying Knowledge about Entities

Advisors: Luke Zettlemoyer and Yejin Choi

Supervisory Committee: Luke Zettlemoyer (co-Chair), Yejin Choi (co-Chair), Emily M. Bender (GSR, Linguistics), and Dan Weld

Abstract: Real world entities such as people, organizations and countries play a critical role in text. Language contains rich explicit and implicit information about these entities, such as the categories they belong to, relationships to other entities, and events they participate in. To extract such knowledge, models must recognize entities from text and build representation for entities. However, this is challenging: even in a single document, different expressions (i.e., Google, the search giant, the company, it) point to the same entity and often knowledge can only be inferred from context without being explicitly stated.

In this work, we focus on entities to interpret text. Specifically, we introduce two approaches for extracting and inferring knowledge about entities. The first work focuses on entity-entity sentiment relationship, i.e., who feels positively (or negatively) towards whom, a task which requires reasoning about social dynamics of entities as well as targeted sentiment expressions. The second work infers free-form phrases that describe appropriate types for the target entity. For example, consider the sentences Bill robbed John. He was arrested." The noun phrases John" and he" have specific types, such as criminal", that can be inferred from context. For this prediction task, we introduce two novel sources of distant supervision: head words of noun phrases found with an automated parser, and types heuristically extracted from Wikipedia pages of linked named entities.

As a future direction, we propose to apply knowledge about entities to improve core NLP tasks such as coreference resolution and question answering. Lastly, we also propose to create summarizations focused on specific entities and examine how entity types affect the way media describes events they participate in.

When: 9 Jun 2017 - 1:00pm Where: CSE 303

## Kenton Lee, General Examination

Title: End-to-end Neural Structured Prediction

Supervisor Committee: Luke Zettlemoyer (Chair), Emily Bender (GSR, Linguistics), Yejin Choi, and Noah Smith

Abstract: Many core challenges in natural language understanding are formulated as structured prediction problems, and recently introduced neural modeling techniques have significantly improved performance in nearly every case.  However, these techniques can be challenging to develop, often requiring a trade-off between tractable inference and the use of neural representations for full output structures. We outline two approaches that make different trade-offs, as demonstrated by models for two different tasks. In CCG parsing, an A* parser improves accuracy by recursively modeling entire parse trees while maintaining optimality guarantees. In coreference resolution, where the model must scale to large documents, simpler inference is critical. Instead, an end-to-end span-level model that does not explicitly model clusters can produce state-of-the-art results.

When: 8 Jun 2017 - 3:30pm Where: CSE 503

## Catherine Baker, Final Examination

Title: Understanding and Improving the Educational Experiences of Blind Programmers

Supervisory Committee: Richard Ladner (Chair), Jennifer Turns (GSR, HCDE), Katharina Reinecke, Alan Borning, and Andy Ko (iSchool)

Abstract: Teaching people with disabilities tech skills empowers them to create solutions to problems they encounter. However, computer science is taught in a highly visual manner which can present barriers for people who are blind. The goal of this dissertation is to understand what those barriers are and to present the projects I have done to decrease those barriers.

The first projects I present look at the barriers that blind students face. I first present the results of my survey and interviews with blind graduates with degrees in computer science or related fields. This work highlights the many barriers that these blind students overcame when they were in college. We then follow-up on one of the barriers mentioned, access to technology, by doing a preliminary accessibility evaluation of six popular integrated development environments (IDEs) and code editors. I found that half were unusable and all have some inaccessible portions.

As access to visual information is one of the barriers in computer science education, I present three projects I have done to increase access to visual information in computer science. The first project is Tactile Graphics with a Voice (TGV). This project investigated an alternative to Braille labels for those that don’t know Braille and showed that TGV was a viable solution. The next project was StructJumper, which creates a modified abstract syntax tree which blind programmers can use to navigate through code. The results of the evaluation show that users could navigate quicker and easily determine the relationships of lines of code. The final project I present is a dynamic graph tool which has two different modes for handling focus changes when moving between graphs. I found that users can use both modes to answer questions about changes in graphs and want access to both modes plus others to be able to select the appropriate mode for the task.

These projects work towards the goal of making computer science education more accessible to blind students. By identifying the barriers that exist and creating solutions to overcome them, we can increase the number of blind students in computer science.

When: 8 Jun 2017 - 10:00am Where: CSE 305

## Shu Liang, General Examination

Title: Virtualize me: Personalized 3D Head Reconstruction

Advisors: Linda Shapiro and Ira Kemelmacher

Supervisor Committee: Linda Shapiro (co-Chair), Irena Kemelmacher (co-Chair), John Kramlich (GSR, ME), and Brian Curless

Abstract: Personalized 3D face reconstruction has produced exciting results over the past few years. However, most methods focus solely on the face area and mask out the rest of the head. In this work, we propose a data-driven approach to obtaining a personalized reconstruction of a subject’s full head in the wild. We start with the visual hull shape gotten from a person’s Internet video and fill in the face details using the shading information extracted from the same person’s photo collections. The hairstyle is retrieved from a synthetic hairstyle dataset and deformed to fit the person’s rough hair shape. We will demonstrate the effectiveness of this algorithm on different celebrities using their photos and videos downloaded from the Internet.

When: 7 Jun 2017 - 2:00pm Where: CSE 203

## James Ferguson, Qualifying Project Presentation

Title: Semi-Supervised Event Extraction with Paraphrase Clusters

Advisors: Hannaneh Hajishirzi and Dan Weld

Abstract: Supervised event extraction systems are limited in their accuracy due to the lack of available training data. We present a method for self-training event extraction systems by taking advantage of the occurrence of multiple mentions of the same event instances across newswire articles from various sources. If our system can make a high-confidence extraction of some mentions in a cluster of news articles, it can then leverage those mentions to identify additional occurrences of the event in the cluster. This allows us to collect a large amount of new, diverse training data for a specific event. Using this method, our experiments show an increase of 1.7 F1 points for a near-state-of-the-art baseline extractor on the ACE 2005 dataset.

When: 6 Jun 2017 - 2:00pm Where: CSE 303

## Kanit Wongsuphasawat, General Examination

Title:  Mixed-Initiative Visualization Tools  for Exploratory Data Analysis

Supervisory Committee: Jeffrey Heer (Chair), Jevin West (GSR, iSchool), Bill Howe, and Jock Mackinlay (Tableau)

Abstract: Visual data analysis is an iterative process that involves both open-ended exploration and focused question answering. However, traditional visual analysis tools provide specification interfaces that are suitable only for question answering but are often tedious for exploration. This doctoral thesis investigates the design of visualization tools that augments traditional specification interfaces with automated chart specification to facilitate exploratory analysis. First, we present the design and controlled studies of Voyager, a system that couples faceted browsing with visualization recommendations to promote open-ended exploration. To facilitate both exploration and question answering, we present Voyager 2, a visual analysis tool that blends manual and automated specification in a unified system. To support specifications and recommendations in Voyager and Voyager 2, we also present the Vega-Lite visualization grammar and the CompassQL visualization query language, which defines an organized collection of charts. In current work, we plan to extend our interfaces and query engine to facilitate exploratory analysis of datasets with a larger number of fields.

When: 2 Jun 2017 - 1:30pm Where: CSE 503

## Luheng He, General Examination

Title: Annotating, Learning, and Representing Shallow Semantic Structure

Supervisory Committee: Luke Zettlemoyer (Chair), Gina-Anne Levow (GSR, Linguistics), Yejin Choi, and Noah Smith

Abstract: One key challenge to understanding human language is to find out the word to word semantic relations, such as “who does what to whom”, “when”, and “where”.  Semantic role labeling (SRL) is the widely studied challenge of recovering such predicate-argument structure, typically for verbs. SRL is designed to be consistent across syntactic annotation and to some extent, language independent, which can potentially benefit downstream applications such as natural language inference, machine translation and summarization. However, the performance of SRL system is limited by the amount of training data, further impeding its usage in downstream applications. My generals work is three folds: 1) collecting more SRL data using natural language driven QA annotation, 2) using end-to-end neural models to accurately predict SRL, and 3) proposing a unified framework for predicting and representing SRL structure to improve performance on downstream applications.

When: 2 Jun 2017 - 10:00am Where: CSE 678

## Christopher Lin, Final Examination

Title: The Intelligent Management of Crowd-Powered Machine Learning

Supervisory Committee: Dan Weld (co-Chair), Mausam (co-Chair), Thomas Richardson (GSR, STAT), Eric Horvitz (MSR), and Carlos Guestrin

Abstract: Artificial intelligence and machine learning powers many technologies today, from spam filters to self-driving cars to medical decision assistants. While this revolution has hugely benefited from developments in core areas like deep learning, it also could not have occurred without data, which nowadays is frequently procured at massive scale from crowds. Because data is so crucial, a key next step towards truly autonomous agents is the design of better methods for intelligently managing the now-ubiquitous crowd-powered data-gathering process.

This dissertation takes this key next step by developing algorithms for the online and dynamic control of data acquisition. We consider how to gather data for its two primary and independent purposes: training and evaluation.

In the first part of the dissertation, we develop algorithms for obtaining data for testing. The most important requirement of testing data is that it must be extremely clean. Thus to deal with noisy human annotations, machine learning practitioners typically rely on careful workflow design and advanced statistical techniques for label aggregation. A common process involves designing and testing multiple crowdsourcing workflows for their tasks, identifying the single best-performing workflow, and then aggregating worker responses from redundant runs of that single workflow. We improve upon this process in two ways: we build a control models that allow for switching between many workflows depending on how well a particular workflow is performing for a given example and worker; and we build a control model that can aggregate labels from tasks that do not have a finite predefined set of multiple choice answers (\eg\ counting tasks.)
We then implement agents that use our new models to dynamically choose whether to acquire more labels from the crowd or stop, and show that they can produce higher quality labels at a cheaper cost than the state-of-the-art baselines.

In the second part of the dissertation, we shift to tackle the second purpose of data: training.
Because learning algorithms are often robust to noise, instead of optimizing for accuracy like test sets, training sets can make tradeoffs between quantity, accuracy, and diversity. We first investigate the tradeoff between quantity and accuracy, in which given a fixed budget, one can spend money on acquiring cleaner labels, or one can spend money acquiring more examples. We survey how inductive bias, worker accuracy, and budget affect whether a larger and noisier training set or a smaller and cleaner one will train better classifiers. We then set up a formal framework for dynamically choosing the next example to label or relabel by generalizing active learning to allow for relabeling, which we call re-active learning, and we design new algorithms for re-active learning that outperform active learning baselines. Finally, we consider the effect of domain skew on strategies for increasing label diversity in a training set. We introduce an algorithm that dynamically switches between crowd generation and crowd labeling and show that it achieves better performance than state-of-the-art baselines across several domains with varying skew.

When: 2 Jun 2017 - 10:00am Where: CSE 303

## Alex Polozov, Final Examination

Title: A Framework for Mass-Market Inductive Program Synthesis

Advisors: Zoran Popovic and Sumit Gulwani

Supervisor Committee: Zoran Popovic (co-Chair), Sumit Gulwani (co-Chair), Gary Hsieh (GSR, HCDE), Emina Torlak, and Ras Bodik

Abstract: Programming by examples (PBE), or inductive program synthesis, is a problem of finding a program in the underlying domain-specific language (DSL) that is consistent with the given input-output examples or constraints. In the last decade, it has gained a lot of prominence thanks to the mass-market deployments of several PBE-based technologies for data wrangling — the widespread problem of transforming raw datasets into a structured form, more amenable to analysis.

However, deployment of a mass-market application of program synthesis is challenging. First, an efficient implementation requires non-trivial engineering insight, often overlooked in a research prototype. This insight takes the form of domain-specific knowledge and heuristics, which complicate the implementation, extensibility, and maintenance of the underlying synthesis algorithm. Second, application development should be fast and agile, tailoring to versatile market requirements. Third, the underlying synthesis algorithm should be accessible to the engineers responsible for product maintenance.

In this work, I show how to generalize the ideas of 10+ previous specialized inductive synthesizers into a single framework, which facilitates automatic generation of a domain-specific synthesizer from the mere definition of the corresponding DSL and its properties. PROSE (PROgram Synthesis using Examples) is the first program synthesis framework that explicitly separates domain-agnostic search algorithms from domain-specific expert insight, making the resulting synthesizer both fast and accessible. The underlying synthesis algorithm pioneers the use of deductive reasoning for designer-defined domain-specific operators and languages, which enables mean synthesis times of 1-3 sec on real-life datasets.

Over the last two years, a dedicated team at Microsoft has built and deployed 10+ technologies on top of the PROSE framework. Using them as case studies, I examine the user interaction challenges that arise after a mass-market deployment of a PBE-powered application. In this talk, I'll show how expressing program synthesis as an interactive problem facilitates user intent disambiguation, incremental learning from additional examples, and increases the users' confidence in the system.

When: 2 Jun 2017 - 9:00am Where: CSE 503

## Paul Vines, Final Examination

Title: Surveillance Self-Defense - from Solutions to New Problems

Advisors: Yoshi Kohno and Franzi Roesner

Supervisory Committee: Yoshi Kohno (co-Chair), Franzi Roesner (co-Chair), Emma Spiro (GSR, iSchool), and Zach Tatlock

Abstract: The world is increasingly connected; a side effect of this digital connectivity is that surveillance has become easier and more constant than ever. Individuals can be surveilled communicating with friends or colleagues, tracked as they visit websites, and advertised to based on an amazing variety of fine-grained data.

In this work, I describe a series of forays into research into modern surveillance and user self-defense against it:

(1) A covert communication system using online games that allows individuals to hold a remote conversation without revealing contents or even the existence of the conversation. Our system withstand state-of-the-art attacks on covert channels.

(2) An evaluation of how current privacy defenses on the web actually perform in preventing web-tracking, including analysis of a wide range of privacy defenses from multiple types of data.

(3) An exploration of a new kind of surveillance: intelligence gathering using online advertising - or ADINT. Online advertisers are in a privileged network position, able to collect, store, and analyze data on billions of people: we investigate how this can be used to perform fine-grained location tracking and more.

When: 31 May 2017 - 3:15pm Where: CSE 303

## James Bornholt, General Examination

Title: Scaling the Symbolic Reasoning Stack

Advisors: Emina Torlak, Dan Grossman, and Luis Ceze

Supervisory Committee: Emina Torlak (co-Chair), Dan Grossman (co-Chair), Luis Ceze (co-Chair), Andy Ko (GSR, iSchool), Xi Wang, and Ras Bodik

Abstract: Symbolic reasoning tools, including those based on program synthesis, have successfully solved challenging tasks in a variety of application domains, from compilers to consistency models. But developing a symbolic reasoning tool that scales to large problems requires considerable expertise—implementing search procedures from scratch and a thorough understanding of the underlying engine—that limits the reach of synthesis technology. My thesis is that new abstractions and interfaces can help programmers to build scalable symbolic reasoning tools. I propose three contributions to scale the symbolic reasoning stack: metasketches to specify search strategies to a reusable synthesis engine, symbolic profiling to understand and optimize the behavior of a symbolic tool, and profile-guided symbolic execution to automatically optimize compilation to constraints. I have demonstrated the effectiveness of these contributions in several domains, including a new tool for synthesizing memory consistency models.

When: 31 May 2017 - 2:00pm Where: CSE 305

## Eric Zeng, Qualifying Project Presentation

Title: End User Security and Privacy Concerns with Smart Homes

Advisors: Franzi Roesner and Yoshi Kohno

Abstract: The Internet of Things (IoT) is becoming increasingly widespread in home environments. Consumers are transforming their homes into smart homes, with internet-connected sensors, lights, appliances and locks, controlled by voice or other user-defined automations. Security experts have identified numerous concerns with IoT and smart homes, including vulnerable and unreliable devices, and the privacy risks of internet-connected sensors in the home. These concerns are supported by recent high profile attacks, such as the Mirai DDoS attacks.

However, little work has studied the security and privacy concerns of end users who live with smart home systems. To bridge this gap, we conduct semi-structured interviews with 15 people living in smart homes (12 smart home administrators and 3 other residents) to learn about how they use their smart homes, and understand their security and privacy related attitudes, expectations, and behaviors. Among other findings, we identify gaps in threat models arising from limited technical understanding of smart homes, awareness of some security issues but limited concern, ad hoc mitigation strategies, and a mismatch between the concerns and power of the smart home administrator and other people in the home. From these and other findings, we distill recommendations for the designers of smart home devices and platforms, as well as for end users, and we identify opportunities for future research.

When: 31 May 2017 - 12:00pm Where: CSE 303

## Amanda Swearngin, Qualifying Project Presentation

Title: Genie: Input Retargeting on the Web through Command Reverse Engineering

Advisors: Andy Ko and James Fogarty

Abstract: Most web applications are designed as one-size-fits-all, despite considerable variation in people’s physical abilities, expertise, and other factors that impact interaction. For
example, some web applications require the use of a mouse, precluding use by many people with severe motor disabilities. Others may require laborious manual input that
a skilled developer could automate if the application were scriptable.

In this work, we present Genie, a system that automatically reverse engineers an abstract model of the underlying commands in a web application, then enables
interaction with that functionality through alternative interfaces and other input modalities (e.g., speech, keyboard, or command line input). Genie comprises an abstract model
of command properties, behaviors, and dependencies as well as algorithms that reverse engineer this model from an existing web application through static and dynamic program
analysis. We evaluate Genie by developing several interfaces that automatically add support for speech, keyboard, and command line input to arbitrary web applications.

When: 31 May 2017 - 9:00am Where: CSE 503

## Ezgi Mercan, Final Examination

Title: Digital Pathology: Diagnostic Errors, Viewing Behavior and Image Characteristics

Supervisory Committee: Linda Shapiro (Chair), Mark Ganter (GSR, ME), Su-In Lee, and Joann Elmore (Public Health)

Abstract: Whole slide imaging technologies provide a unique opportunity to collect and analyze large amounts of data on pathologists' interactions with the digital slide. In our work, we are studying the underlying causes of diagnostic errors in histopathology. Instead of focusing on the detection of invasive cancer, we consider the full-spectrum of diagnoses that a pathologist encounters during clinical practice and aim to study misidentification and misinterpretation errors that may cause overdiagnoses or underdiagnoses. To this end, we use the digiPATH dataset that consists of 240 breast biopsies with diagnoses ranging from benign to invasive cancer, the actions of pathologists recorded during their interpretations of the slides and the diagnostic regions associated with the final diagnoses they assigned. Our work consists of three parts: region of interest localization, diagnostic classification and viewing behavior analysis.

In this talk, I will introduce a novel methodology to extract the diagnostically relevant regions of interest from pathologists' viewing behavior, and a computer vision model to detect these regions automatically on unseen images. Then, I will talk about our work on the diagnostic classification of these regions of interest, and describe our features based on tissue label segmentations of the images. Our features include a new kind of image feature defined in for the first time in this work: a sequence of histograms that together comprise the structure feature that can differentiate atypia and DCIS diagnoses more accurately than the pathologists. Finally, I will focus on two pieces of work that analyze the interpretation patterns of the pathologists on the whole slide images: I will first talk about the relationship between the identification of the correct ROI and the diagnosis. Then, I will introduce novel measurements of interpretative patterns and identify two strategies used by the pathologists: scanning and drilling and discuss how these strategies affect the diagnostic decision making process.

When: 26 May 2017 - 10:00am Where: CSE 303

## Alex Mariakakis, General Examination

Title: Using Mobile Devices to Quantify Traditionally Qualitative Health Measures

Advisors: Shwetak Patel and Jacob Wobbrock

Supervisory Committee: Shwetak Patel (co-Chair), Jacob Wobbrock (co-Chair), Katherine Steele (GSR, ME), James Fogarty, and Kurtis Heimerl

Abstract: The path to getting treatment for a medical condition typically begins with an observation made with one of the five human senses, whether by a physician or the patient themselves.

The person may feel warm to the touch, hear a wheezing in their cough, or see that their skin is pale. These observations are often qualitative and subjective, which can lead an untrained person to ignore their own symptoms and neglect treatment until their condition worsens.

Although medical devices exist for quantifying such observations or detecting their underlying causes, they are often not deployed to the general population due to cost and burden of use. For my thesis, I propose that the qualitative health measures convenient for some can be made quantitative for all with little additional burden by examining the ways that medical specialists use their human senses and replicating them through smartphone sensors. My work focuses on how the smartphone camera can be used in place of visual inspection to quantify diagnostic observations of the eye. These projects cover a variety of medical issues, including glaucoma, traumatic brain injuries, and pancreatic cancer. After I describe my projects in detail, I outline the work I expect to complete for my dissertation. Beyond iterating on my current projects, this plan includes interviews where I hope to gather design recommendations from different stakeholders in mobile health (e.g., doctors, physicians, patients, and non-patients) and understand their perspectives on how and when smartphones should be used for diagnosis.

When: 25 May 2017 - 1:00pm Where: CSE 303

## Mandar Joshi, Qualifying Project Presentation

Title: TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension

Advisors: Dan Weld and Luke Zettlemoyer

Abstract: We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples. TriviaQA includes 95K question-answer pairs authored by trivia enthusiasts and independently gathered evidence documents, six per question on average, that provide high quality distant supervision for answering the questions. We show that, in comparison to other recently introduced large-scale datasets, TriviaQA (1) has relatively complex, compositional questions, (2) has considerable syntactic and lexical variability between questions and corresponding answer-evidence sentences, and (3) requires more cross sentence reasoning to find answers. We also present two baseline algorithms: a feature-based classifier and a state-of-the-art neural network, that performs well on SQuAD reading comprehension. Neither approach comes close to human performance (23% and 40% vs. 80%), suggesting that TriviaQA is a challenging testbed that is worth significant future study.

When: 23 May 2017 - 12:00pm Where: CSE 303

## Philip Cho, Qualifying Project Presentation

Title: Speeding up gradient boosting for training and prediction

Advisors: Carlos Guestrin and Arvind Krishnamurthy

Abstract: Gradient boosting [3] is one of most popular algorithms due to its simplicity and predictive performance. The algorithm produces an ensemble of decision trees, which has many desirable properties as a statistical model. XGBoost [1] is a fast, scalable implementation of gradient boosting that has found a wide following among hobbyists and practitioners. This paper presents our recent contribution for fast training and prediction. Using a series of optimizations such as feature quantization and histogram aggregation of gradients, we accelerate training by 2-4 times without loss of model accuracy. We add support for multiple tree-growing strategies, allowing for potentially unbalanced trees in pursuit of faster loss reduction. In addition, we add a dedicated framework for prediction, in which XGBoost emits generated code that can be used in isolation by user applications. Applications would no longer have to link XGBoost in order to make predictions.

When: 19 May 2017 - 9:00am Where: CSE 303

## Zuoming Shi, Qualifying Project Presentation

Title: The Impact of Optional Game Mechanics and Email Notifications on Player Contributions and Return Rate in Mozak

Advisors: Zoran Popović and Katharina Reinecke

Abstract: We describe the effect of optional tutorial components and email notifications on the retention rate and useful contributions of players in the neuroscience crowd-sourcing game Mozak. We performed between-subject experiments with data collected from 7,000 online players and 1400 registered players, and measured time played, amount of scientific contributions made, and return rate. We found that framing mandatory tutorials as optional game mechanics had positive effects on overall player contribution and return rate, but may decrease time played for new players. Email notifications was found to increase return rate, but not player contributions. Our results suggest that framing tutorials and optional game mechanics can be highly beneficial for online crowd-sourcing games. In addition, our work indicates that single-session play-time is not a reliable metric for evaluating player engagement, as it did not strongly correlate with other metrics previously attributed to player engagement.

When: 18 May 2017 - 3:00pm Where: CSE 403

## Koosha Khalvati, Qualifying Project Presentation

Title: A Probabilistic Model of Social Decision Making based on Reward Maximization

Advisors: Rajesh Rao and Andrea Stocco (I-LABS)

Abstract: A fundamental problem in cognitive neuroscience is how humans make decisions, act, and behave in relation to other humans. Here we adopt the hypothesis that when we are in an interactive social setting, our brains perform Bayesian inference of the intentions and cooperativeness of others using probabilistic representations. We employ the framework of partially observable Markov decision processes (POMDPs) to model human decision making in a social context, focusing specifically on the volunteer’s dilemma in a version of the classic Public Goods Game. We show that the POMDP model explains both the behavior of subjects as well as neural activity recorded using fMRI during the game. The decisions of subjects can be modeled across all trials using two interpretable parameters. Furthermore, the expected reward predicted by the model for each subject was correlated with the activation of brain areas related to reward expectation in social interactions. Our results suggest a probabilistic basis for human social decision making within the framework of expected reward maximization.

When: 18 May 2017 - 2:30pm Where: CSE 203

## Younghoon Kim, Qualifying Project Presentation

Title: GraphScape: A Model for Automated Reasoning about Visualization Similarity and Sequencing

Advisors: Jeffrey Heer and Jessica Hullman

Abstract: We present GraphScape, a directed graph model of the vi- sualization design space that supports automated reasoning about visualization similarity and sequencing. Graph nodes represent grammar-based chart specifications and edges rep- resent edits that transform one chart to another. We weight edges with an estimated cost of the difficulty of interpreting a target visualization given a source visualization. We con- tribute (1) a method for deriving transition costs via a partial ordering of edit operations and the solution of a resulting lin- ear program, and (2) a global weighting term that rewards consistency across transition subsequences. In a controlled experiment, subjects rated visualization sequences covering a taxonomy of common transition types. In all but one case, GraphScape’s highest-ranked suggestion aligns with subjects’ top-rated sequences. Finally, we demonstrate applications of GraphScape to automatically sequence visualization presen- tations, elaborate transition paths between visualizations, and recommend design alternatives (e.g., to improve scalability while minimizing design changes).

When: 17 May 2017 - 3:00pm Where: CSE 305

## Spencer Pearson, Qualifying Project Presentation

Title: Evaluating and Improving Fault Localization Techniques

Advisors: Michael Ernst and Zach Tatlock

Abstract: Fault localization (FL) is the process of identifying which snippets of code need to be changed in order to fix a bug. Automated FL techniques typically instrument the faulty program, run the test suite, and search for statements that are disproportionately important to the failing tests' executions. Many such techniques have been proposed, but evaluations almost invariably focus on artificial faults, created by deliberately inserting faults into correct programs; these are not representative of real faults, and nobody has investigated whether they are good proxies for real faults when evaluating FL techniques.

We evaluated several previously-studied FL techniques on both artificial and real faults, and found that while previous results mostly replicated on artificial faults, most techniques performed almost indistinguishably on real faults. To discover what would make a FL technique do better on real faults, we identified a common structure behind all the techniques we considered, exhaustively analyzed all combinations of the different techniques' parameters, and found the causes of the most common failure modes. From this information, we were able to construct several new techniques that localize faults more effectively than any of the previous FL techniques we studied.

When: 17 May 2017 - 12:30pm Where: CSE 305

## Yue Zhang, Qualifying Project Presentation

Title: Analyzing and Predicting Alternative Splicing With Convolutional Neural Networks

Advisors: Georg Seelig and Larry Ruzzo

Abstract: Alternative splicing is a biological process by which a single mRNA transcript can be spliced into any of several forms, coding for proteins that can have different functions. This process is regulated by many factors, and its disregulation has been implicated in a number of diseases.

We focus on two main problems: discovering sequence motifs that affect alternative splicing, and predicting the effects of mutations on alternative splicing in vivo. Using a synthetic dataset, we trained a variety of neural network models in order to predict the effects of the exonic DNA sequence on splicing. Then, these models were used to derive sequence motifs that contribute to alternative splicing regulation across different cell types. For the prediction step, we tested the predictive power of these models as applied to a dataset of genomic mutations, and compared them with previous models. We will also discuss the construction of a database for alternative splicing events in the genome.

When: 15 May 2017 - 1:00pm Where: CSE 403

## Kaiyuan Zhang, Qualifying Project Presentation

Title: Whole-system security with machine-checked verification

Advisors: Xi Wang, Arvind Krishnamurthy, and Emina Torlak

Abstract: Chimera is a new framework for developing applications with formal, whole-system security guarantees. It employs a systematic approach to decomposing a security property and using a portfolio of theorem provers to discharge the result- ing proof obligations. It allows programmers to specify and verify cryptographic security properties about their applica- tions at the level of assembly code, without trusting the com- piler or runtime; it also allows programmers to invoke auto- mated provers when possible. To further reduce the proof burden, Chimera introduces a security kernel, which allows verified applications to safely reuse unverified components (e.g., off-the-shelf Linux) through sandboxing.

We have implemented Chimera for 64-bit ARM, and used it to build a set of five Chimera applications: a chat client, as well as a notary app, an incrementer app, a password hasher, and a differential privacy service ported from the Ironclad project. Our experience shows that these applications built using Chimera can provide both functional correctness and cryptographic security properties through formal verifica- tion, with a moderate proof effort.

When: 15 May 2017 - 11:00am Where: CSE 503

## Katie Doroschak, Qualifying Project Presentation

Title: Effects of read start bias on discovering allele-specific expression in RNA-seq

Advisors: Larry Ruzzo and Sreeram Kannan (EE)

Abstract: Allele-specific expression (ASE) is a biological phenomenon in which one half of the genome at a particular location is expressed more than the other. Computational methods and data science enable discovery of ASE in RNA-seq data. If an algorithm can computationally identify regions where this is occurring differently in two organisms of the same species, ASE could be used as a marker for a cross between them. Technical artefacts from the sequencing process must also be corrected for computationally. Read start bias is one such artefact that has not been evaluated with respect to ASE. In this work, I identify regions exhibiting ASE and evaluate the effects of read start bias on the ability to discover them. This is achieved with the AlleleSeq pipeline (Rozowsky et al. Molecular systems biology, 2011) with bias correction via SeqBias (Jones et al. Bioinformatics, 2012), and applied to RNA-seq data from an interesting marine organism called Thalassiosira pseudonana. It is theorized that some strains of T. pseudonana have undergone a lapse in part of their reproductive cycle, and I evaluate the potential of using ASE as a marker to study this question.

When: 15 May 2017 - 10:30am Where: CSE 303

## Patrick Lancaster, Qualifying Project Presentation

Title: Improved Object Pose Estimation via Deep Pre-touch Sensing

Advisors: Joshua Smith and Dieter Fox

Abstract: For certain manipulation tasks, pose estimation from head-mounted cameras may not be sufficiently accurate due to the inability to perfectly calibrate the coordinate frames of today's high degree of freedom robot arms that link the head to the end-effectors. We present a novel framework combining pre-touch sensing and deep learning to more accurately estimate pose in an efficient manner. The use of pre-touch sensing allows our method to localize the object directly with respect to the robot's end effector, thereby avoiding error caused by miscalibation of the arms. Instead of requiring the robot to scan the entire object with its pre-touch sensor, we utilize a deep neural network to detect object regions that contain distinctive geometric features. By focusing pre-touch sensing on these regions, the robot can more efficiently gather the information necessary to adjust its original pose estimate. Our region detection network was trained using a new dataset containing objects of widely varying geometries and has been labeled in a scalable fashion that is free from human bias. This dataset is applicable to any task that involves a pre-touch sensor gathering geometric information, and has been made publicly available. We evaluate our framework by having the robot re-estimate the pose of a number of objects of varying geometries. We find that after a sequence of scans, the object can typically be localized to within 0.5 cm of its true position. We also observe that the original pose estimate can often be significantly improved after collecting a single quick scan.

When: 12 May 2017 - 11:00am Where: CSE 303

## Max Forbes, Qualifying Project Presentation

Title: Verb Physics: Relative Physical Knowledge of Actions and Objects

Advisors: Yejin Choi and Noah Smith

Abstract: Learning commonsense knowledge from natural language text is nontrivial due to reporting bias: people rarely state the obvious, e.g., My house is bigger than me.'' However, while rarely stated explicitly, this trivial everyday knowledge does influence the way people talk about the world, which provides indirect clues to reason about the world. For example, a statement like, Tyler entered his house'' implies that his house is bigger than Tyler.

In this paper, we present an approach to infer relative physical knowledge of actions and objects along five dimensions (e.g., size, weight, and strength) from unstructured natural language text. We frame knowledge acquisition as joint inference over two closely related problems: learning (1) relative physical knowledge of object pairs and (2) physical implications of actions when applied to those object pairs. Empirical results demonstrate that it is possible to extract knowledge of actions and objects from language and that joint inference over different types of knowledge improves performance.

When: 12 May 2017 - 11:00am Where: CSE 503

## Christopher Xie, Qualifying Project Presentation

Title: A Unified Framework for Cold Start Time Series Forecasting in the Presence of Missing Data

Abstract: Providing long-range forecasts is a fundamental challenge in time series modeling, which is only compounded by the challenge of having to form such forecasts when a time series has never previously been observed. The latter challenge is the time series version of the cold start problem seen in recommender systems which, to our knowledge, has not been directly addressed in previous work. Time series models are also often plagued by missing data and high-dimensionality (i.e., large collections of observed series), making them ill-suited to the typical structure of big data time series. We provide a unified framework for producing long- range forecasts even when the series has missing values or was previously unobserved; the same framework can be used to impute missing values. Key to the formulation and resulting performance is (1) leveraging repeated patterns over fixed periods of time and across series, and (2) metadata associated with the individual series. We provide an analysis of our framework on web traffic in a Wikipedia dataset.

When: 11 May 2017 - 12:30pm Where: CSE 403

## Eddie Yan, Qualifying Project Presentation

Title: Customizing Progressive JPEG for Efficient Image Storage

Advisors: Luis Ceze and Xi Wang

Abstract: Modern image storage services, especially those associated with social media services, host massive collections of images. These images are often replicated at many different resolutions to support different devices and contexts, incurring substantial capacity overheads. We describe a prototype system design, ShoeBox, to investigate the effectiveness of replacing redundant multi-resolution image storage with progressively encoded JPEG images. After evaluating ShoeBox’s design and performance, we switch to an approach better suited to the capabilities of progressive JPEG which instead resizes images at request time. We describe methods of handling the inefficiencies associated with resizing images on the fly, and propose repurposing the progressive JPEG standard and customizing the organization of image data to reduce the bandwidth overheads of dynamic resizing. We show that at a PSNR of 32 dB, dynamic resizing with progressive JPEG provides 3× read data savings over baseline JPEG, and that progressive JPEG with customized encode parameters can further improve these savings to almost 6×. Additionally, we characterize the decode overheads of progressive JPEG to assess the feasibility of directly decoding progressive JPEG images on energy-limited devices. Our approach does not require modifications to current JPEG software stacks.

When: 10 May 2017 - 1:00pm Where: CSE 615 (Sampa Lab)

## Lucy Simko, Qualifying Project Presentation

Advisors: Yoshi Kohno and Luke Zettlemoyer

Abstract: Source code attribution classifiers have recently become powerful. We consider the possibility that an adversary could craft code with the intention of causing a misclassification, i.e., creating a forgery of another author's programming style in order to hide the forger's own identity or blame the other author. We find that it is possible for a non-expert adversary to defeat such a system. In order to inform the design of adversarially resistant source code attribution classifiers, we conduct two studies with C/C++ programmers to explore the potential tactics and capabilities both of such adversaries and, conversely, of human analysts doing source code authorship attribution. Through the quantitative and qualitative analysis of these studies, we (1) evaluate a state of the art machine classifier against forgeries, (2) evaluate programmers as human analysts/forgery detectors, and (3) compile a set of modifications made to create forgeries. Based on our analyses, we then suggest features that future source code attribution systems might incorporate in order to be adversarially resistant.

When: 10 May 2017 - 12:00pm Where: CSE 503

## Rahul Nadkarni, Qualifying Project Presentation

Title: Sparse plus Low-rank Graphical Models of Time Series for Functional Connectivity in MEG

Advisors: Emily Fox and Rajesh Rao

Abstract: A fundamental problem in neuroscience is inferring the functional connections between brain regions that underlie cognitive behaviors such as vision, speech, and audition. Magnetoencephalography (MEG) has become a popular neuroimaging technique for studying these functional connectivity networks. However, unlike typical approaches for learning these networks from data, we treat the MEG signals as time series rather than independent observations. We represent the functional connectivity network through conditional independence statements between signals on the cortical surface, encoded via a graphical model of time series. Importantly, we extend previous techniques for learning graphs of time series by incorporating a low-rank component, accounting for latent signals that would otherwise lead to inferring spurious connections. We evaluate the model on synthetic data as well as real MEG data collected from an auditory attention task.

When: 8 May 2017 - 11:30am Where: CSE 503

## Emily Furst, Qualifying Project Presentation

Title: Profiling General Purpose GPU Applications

Advisors: Mark Oskin and Bill Howe

Abstract: General Purpose computing on Graphics Processing Units (GPGPU) has become an increasingly popular option for accelerating a wide variety of applications. However, GPUs are not well-suited for all types of applications as data transfer costs can often dominate execution.

We develop a methodology for quantifying how well applications utilize GPU architectures using proprietary profiling tools. By aggregating various profiling metrics, we break down the different aspects that comprise occupancy on the GPU across the runtime of application execution.
We show results for a GPU database implementation, several DOE miniapps, and TensorFlow.
We show that for most of the applications we studied, especially the database and neural net implementations, only a small minority of execution time is spent on the GPU. We further show that even in cases with seemingly good performance, a large portion of the achieved occupancy can actually be attributed to stalls and scalar instructions.

When: 3 May 2017 - 1:00pm Where: CSE 615 (Sampa Lab)

## Jingjing Wang, General Examination

Title: Runtime Optimizations for Large-Scale Data Analytics

Supervisory Committee: Magdalena Balazinska (Chair), Bingni Brunton (GSR, Biology), Dan Suciu, Alvin Cheung, and Arvind Krishnamurthy

Abstract: Large-scale data analytics benefits from optimizations from many aspects. One category of optimizations is hard to be performed statically since they rely on information that is only available during runtime. We show that these runtime optimizations have large impact on application performance. Specifically, we investigate the problem from the following aspects: execution models for iterative queries in shared-nothing architectures, elastic memory management for cloud data analytics, and real-time visual applications. We summarize our work on the first two subjects and propose our plan for the third project.

When: 1 May 2017 - 1:00pm Where: CSE 405

## Danielle Bragg, General Examination

Title: Making Visual Language Accessible: A Crowdsourced Approach

Supervisory Committe: Richard Ladner (Chair), Jaime Snyder (GSR, iSchool), Meredith June Morris, Alan Borning, and Katharina Reinecke

Abstract: This talk presents research to make visual language more accessible to sign language users and low-vision readers. Worldwide, there are about 70 million deaf people using a sign language as their first language, and almost 300 million people with visual impairments. However, much of society and the technical world communicates through written text. This excludes many people from full participation, as sign languages lack a standard written form, and low-vision readers struggle to access text visually.

In this talk, I will demonstrate how to design, build, and evaluate systems that include deaf and low-vision people in our text-based communication platforms by providing equal access to visual language. These solutions are powered by data-driven approaches, but it can be challenging to collect sufficient data from minority disabled populations. Our methods overcome data scarcity challenges by making use of other populations, such as student language learners and crowdsourcing workers. In this talk, I will present three projects that embody this approach: 1) a crowdsourced American Sign Language (ASL) dictionary, 2) smartfonts and livefonts, alternate scripts that improve legibility for low-vision English readers, and 3) the first animated ASL character system.

When: 1 May 2017 - 10:00am Where: CSE 303

Title: Sim2Real Collision Avoidance for Indoor Navigation of Mobile Robots via Deep Reinforcement Learning

Supervisory Committee: Sergey Levine (Chair), Behcet Acikmese (GSR, Aeronautics and Astronautics), Steve Seitz, Emo Todorov, and Larry Zitnick (Facebook)

Abstract: Deep reinforcement learning has emerged as a promising technique for automatically acquiring control policies that can process raw sensory inputs and perform complex behaviors. However, extending deep RL to real-world robotic tasks has proven challenging, particularly in safety-critical domains such as autonomous flight, where a trial-and-error learning process is often impractical. Here, we explore the following question: can we train vision-based navigation policies entirely in simulation, and then transfer them into the real world without a single real training instance? We propose a learning method that we call (CAD)2RL, which can be used to perform collision-free indoor flight in the real world while being trained entirely on 3D CAD models. Our learned collision avoidance policy is represented by a deep convolutional neural network that directly processes raw monocular images and outputs velocity commands. This policy is trained entirely in simulation, with a Monte Carlo policy evaluation algorithm that directly optimizes the network’s ability to produce collision-free path. By highly randomizing the rendering settings for our simulated training set, we show that we can train a policy that generalizes to the real world.

When: 25 Apr 2017 - 1:30pm Where: Sieg 322

## Chenglong Wang, Qualifying Project Presentation

Title: Synthesizing Highly Expressive SQL Queries from Input-Output Example

Advisors: Ras Bodik and Alvin Cheung

Abstract: SQL is the de facto language for manipulating relational data. Though powerful, SQL queries can be difficult to write due to their highly expressive constructs. Using the programming-by-example paradigm to help users write SQL queries presents an attractive proposition, as evidenced by online help forums such as Stack Overflow. However, developing techniques to synthesize SQL queries from input-output (I/O) examples has been difficult due to SQL's rich set of operators.

We present a new scalable and efficient algorithm to synthesize SQL queries from I/O examples. Our key innovation is the development of a language for abstract queries, i.e., queries with uninstantiated operators, that can express a large space of SQL queries efficiently. Using abstract queries to represent the search space nicely decomposes the synthesis problem into two tasks: (1) searching for abstract queries that can potentially satisfy the given I/O examples, and (2) instantiating the found abstract queries and ranking the results.

We implemented the algorithm in a new tool, called Scythe, and evaluated it on 193 benchmarks collected from Stack Overflow. Our results showed that Scythe efficiently solved 74% of the benchmarks, most in just a few seconds.

When: 25 Apr 2017 - 12:30pm Where: CSE 403

## Kiana Ehsani, Qualifying Project Presentation

Title: SeGAN: Segmenting and Generating the Invisible

Abstract: Objects often occlude each other in scenes; Inferring their appearance beyond their visible parts plays an important role in scene understanding, depth estimation, object interaction and manipulation. In this paper, we study the challenging problem of completing the appearance of occluded objects. Doing so requires knowing which pixels to paint (segmenting the invisible parts of objects) and what color to paint them (generating the invisible parts). Our proposed novel solution, SeGAN, jointly optimizes for both segmentation and generation of the invisible parts of objects. Our experimental results show that: (a) SeGAN can learn to generate the appearance of the occluded parts of objects; (b) SeGAN outperforms state-of-the-art segmentation baselines for the invisible parts of objects; (c) trained on synthetic photo realistic images, SeGAN can reliably segment natural images; (d) by reasoning about occluder occludee relations, our method can infer depth layering.

When: 18 Apr 2017 - 1:30pm Where: CSE 403

## Victoria Lin, Qualifying Project Presentation

Title: Program synthesis from natural language using recurrent neural networks

Advisors: Luke Zettlemoyer and Michael Ernst

Abstract: Even if a competent programmer knows what she wants to do and can describe it in English, it can still be difficult to write code to achieve the goal. Existing resources, such as question-and-answer websites, tabulate specific operations that someone has wanted to perform in the past, but they are not effective in generalizing to new tasks, to compound tasks that require combining previous questions, or sometimes even to variations of listed tasks.

Our goal is to make programming easier and more productive by letting programmers use their own words and concepts to express the intended operation, rather than forcing them to accommodate the machine by memorizing its grammar. We have built a system that lets a programmer describe a desired operation in natural language, then automatically translates it to a programming language for review and approval by the programmer. Our system, Tellina, does the translation using recurrent neural networks (RNNs), a state-of-the-art natural language processing technique that we augmented with slot (argument) filling and other enhancements.

We evaluated Tellina in the context of shell scripting. We trained Tellina's RNNs on textual descriptions of file system operations and bash one-liners, scraped from the web. Although recovering completely correct commands is challenging, Tellina achieves top-3 accuracy of 80% for producing the correct command structure. In a controlled study, programmers who had access to Tellina outperformed those who did not, even when Tellina's predictions were not completely correct, to a statistically significant degree.

When: 28 Mar 2017 - 12:30pm Where: CSE 691

## Scott Lundberg, General Examination

Title: Actionable machine learning in genetics, personalized medicine, and health

Supervisory Committee: Su-In Lee (Chair), Meliha Yetisgen (GSR, Biomedical Informatics), Ali Shojaie (Biostatistics, Public Health), and Walter Ruzzo

Abstract: Modern machine learning is changing how we approach biology, medicine, and health. However, a key challenge is to build models that produce actionable insights rather than opaque predictions. To support this we propose four methods that learn interpretable, and hence actionable, machine learning models:

1) ChromNet learns the network structure of interactions among proteins in the human cell nucleus, and has provided effective visualization of these results to over 5,000 users so far. This allows us to understand which proteins collaborate to manage our DNA.

2) SHAP unifies several methods designed to provide additive explanations of any machine learning model. It uses an axiom-based approach to justify why a single explanation can be applied to models of any type.

3) Prescience learns an ensemble model for the prediction of hypoxia in the operating room and then uses a SHAP-based approach to explain the prediction. This demonstrates the use of a complex model while still allowing a doctor to understand what causes a patient's risk.

4) Study Flow will enable us to combine multiple models from different studies while retaining the relevant causal assumptions. By encoding the causal assumptions of each model, we can automated large parts of the systematic review process.

When: 17 Mar 2017 - 3:00pm Where: CSE 128

## Lauren Milne, General Examination

Title: Making Education More Accessible for Blind Children Using Touchscreens

Supervisory Committee: Richard Ladner (Chair), Katherine Steele (GSR, ME), Steve Tanimoto, James Fogarty, and Andy Ko (iSchool)

Abstract: Many digital educational tools for elementary school-aged children are highly visual in nature. While using graphics as opposed to purely text can make the applications more appealing and easier to use for children that are developing literacy skills, it also makes them largely inaccessible for blind children. For my dissertation, I propose using touchscreen devices such as smartphones and tablets to make two kinds of educational applications, early literacy and early programming, accessible to blind children. In my previous work, I explored using the spatial layout of a smartphone to create games to teach Braille characters using haptic and audio feedback. In my current and future work, I am studying how to make block-based programming languages, used to teach programming concepts to children, and their accompanying output accessible to blind children using touchscreens.

When: 13 Mar 2017 - 2:00pm Where: CSE 303

## Trevor Perrier, General Examination

Title: Connecting End Users to Domain Experts With Universal Mobile Phones Services

Supervisory Committee: Richard Anderson (Chair), Sarah Odell Gimbel-Sherr (GSR, Global Health), Kurtis Heimerl, and Katharina Reinecke

Abstract: The potential for the communication services available on any mobile phone to engage users and facilitate behavior change has been well documented. Designing programs for individual empowerment and behavior change using SMS and voice services allows projects to reach marginalized segments of the global population. The majority of mobile messaging services for development are either one-way push messaging or use fully automated bidirectional end points. While these projects are easy to scale, reaching large numbers of people with limited human resources, they have difficulty addressing personalized needs. In addition as the size of a project increases it becomes important to plan and adapt to unexpected health concerns and outcomes. In this work we examine the implications, limitations, and feasibility of semi-automated bi-directional mobile messaging applications for global health. Our research aims induce 1) to better understand how end users perceive and understand specific personalized health information, 2) design and build tools that allow domain experts to more efficiently and effectively communicate with end users, 3) explore the feasibility of integrating semi-automated messaging into the current standard of care. Specifically, we will iteratively design, build, and deploy a platform targeting maternal and neonatal health outcomes in peri-urban and rural Kenya. This work establishes a research agenda answering these questions with data from three trails utilizing the platform and the planned deployment of a fourth.

When: 9 Mar 2017 - 9:00am Where: CSE 503

Title: Understanding Data Cleaning Pipeline for Development Data

Supervisory Committee: Richard Anderson (Chair), C Leigh Anderson (GSR, Evan's School), Kurtis Heimerl, and Abraham Flaxman (Global Health)

Abstract: The developing world is relying more and more on data driven policies. There are numerous development agencies that have pushed for on-ground data collection to support the development work they are pursuing. Many governments have launched their own efforts for more frequent information gathering. Overall, the amount of data collected is tremendous, yet we face significant issues in doing useful analysis. Most of these barriers are around data cleaning and merging, and they require a data engineer to support some parts of the analysis. In this thesis, we aim to understand the pain points of cleaning the development data and propose solutions that harness the thinking of a data engineer and reduce the manual workload of this tedious process. To achieve this goal, I am currently focused on two research directions: (1) understanding current data usage patterns and build a taxonomy of data cleaning in developing world; and (2) Building algorithms to support automated data cleaning, targeting selected problems in the space including matching transliterated names. The objective is to empower regular data users to easily do the necessary data cleaning and scrubbing for analysis.

When: 8 Mar 2017 - 12:00pm Where: CSE 305

## Konstantin Weitz, Final Examination

Title: Formal Semantics and Scalable Verification for the Border Gateway Protocol using Proof Assistants and SMT Solvers

Advisors: Michael Ernst and Zach Tatlock

Supervisory Committee: Michael Ernst (co-Chair), Zach Tatlock (co-Chair), Silas Beane (GSR, Physics), and Arvind Krishnamurthy

Abstract: To reliably and securely route traffic across the Internet, Internet Service Providers (ISPs) must configure their Border Gateway Protocol (BGP) routers to implement policies restricting how routing information can be exchanged with other ISPs.  Correctly implementing these policies in low-level router configuration languages, with configuration code distributed across all of an ISP's routers, has proven challenging in practice, and misconfiguration has led to extended worldwide outages and traffic hijacks.

We present Bagpipe, the first system that enables ISPs to declaratively specify control-plane policies and that automatically verifies that router configurations implement such policies using an SMT solver.

We evaluated the expressiveness of Bagpipe's policy specification language on 10 configuration scenarios from the Juniper TechLibrary, and evaluated the efficiency of Bagpipe on three ASes with a total of over 240,000 lines of Cisco and Juniper BGP configuration. Bagpipe revealed 19 policy violations without issuing any false positives.

To ensure Bagpipe correctly checks configurations, we verified its implementation in Coq, which required developing both the first formal semantics for BGP based on RFC~4271; and SpaceSearch, a new framework for verifying solver-aided tools.

We provide evidence for the correctness and usefulness of our BGP semantics by verifying Bagpipe, formalizing Gao and Rexford's pen-and-paper proof on the stability of BGP (this proof required a necessary extension to the original proof), and performing random differential testing of C-BGP (a BGP simulator) revealing 2 bugs in C-BGP, but none in our BGP semantics.

We provide evidence for the general usefulness of SpaceSearch, by building and verifying two solver-aided tools. The first tool is Bagpipe, the second tool, SaltShaker, checks that RockSalt's x86 semantics for a given instruction agrees with STOKE's x86 semantics. SaltShaker identified 7 bugs in RockSalt and 1 bug in STOKE. After these systems were patched by their developers, SaltShaker verified the semantics' agreement on 15,255 instructions in under 2h.

When: 7 Mar 2017 - 12:00pm Where: CSE 503

## John Toman, General Examination

Title: Learning to Adapt: Analyses for Configurable Software

Supervisory Committee: Dan Grossman (Chair), Cecilia Aragon (GSR, HCDE), Zachary Tatlock, and Ras Bodik

Abstract: Configurations are found at all levels of the software stack: from kernel options to configuration files in MapReduce. However, ubiquitous configurations are not without consequences; these negative effects impact developers, end-users, and tool designers. Users must find and set the correct configuration options to achieve the behavior they want: failing to set options correctly can lead to unexpected behavior or crashes. Developers must also contend with large spaces of configurations and non-obvious interaction of features. Tool designers must contend with configurable applications, libraries and frameworks, where information about runtime behavior is spread across multiple artifacts including code annotations, configuration files, and more.

In this talk, I propose a dissertation that explores improving software quality by building analyses for configurable software. I motivate and propose two hypotheses that my dissertation will explore. I review related work in the area of my future dissertation work and briefly summarize my prior research in the area of configurable software.  Finally, I sketch my future dissertation work and work I plan to pursue after graduation.

When: 6 Mar 2017 - 2:30pm Where: CSE 403

## Maxim Grechkin, General Examination

Title: Deep Curation: An Unsupervised Approach to Biological Data Curation

Supervisory Committee: Bill Howe (Chair), David Ribes (GSR, HCDE), Hoifung Poon (MSR), and Larry Ruzzo

Abstract: Public repositories such as Gene Expression Omnibus (GEO) have seen rapid accumulation of biological data. However, high-quality and consistent annotation is generally unavailable, which severely limits the use of multiple datasets for new discoveries, reproducibility, and other computational tasks. Previous attempts to automate the curation task require hand-labeled training data, which is not generally available and must be reacquired whenever the ontology that provides the class labels change.

We propose a new method that learns tissue type labels for GEO datasets with no training labels. We learn two classifiers in tandem, one over the free-text description for a dataset and another over the raw microarray signal itself, and having the two classifiers train each other iteratively. We applied this method to  GEO to produce an expression-based classifier that outperforms the state-of-the-art supervised-learning method in accuracy without any hand-labeled training data.

When: 3 Mar 2017 - 12:00pm Where: CSE 403

## Greg Nelson, Qualifying Project Presentation

Title: Learning To Read Code First: Evaluating a Novel Pedagogy for Learning Programming with an Interactive Visual Tutoring System

Advisors: Andy Ko (iSchool) and Steve Tanimoto

Abstract: What knowledge does learning programming require? Prior work has focused on theorizing and teaching writing and problem solving skills. We reconceptualize program reading as a separate skill from writing, and propose a novel formal theory of program tracing knowledge based on symbolic execution. Within program reading, we separate  program literary comprehension and program behavior comprehension (which includes program tracing), and provide a new explanation for the “computer is like a person” misconception among novices. We use this theory to argue for initially separating the learning of reading and writing, as well as carefully delineating them throughout learning programming. Based on our formal theory of program tracing knowledge, we propose pedagogical principles for teaching program tracing within a program reading curriculum. To assess learning within this new pedagogy, we build a new tutorial system - PLTutor - with novel technical capabilities that enable 1) curricular flexibility 2) assessments of reading knowledge at multiple levels of granularity 3) visualization of program--instruction--machine-state relationships with code token and sub-expression granularity. We evaluate learning gains among self-selected CS1 students using a block randomized controlled lab study comparing PLTutor with Codecademy(a writing tutorial). We cautiously interpret our results due to small sample size and assessment validity concerns. We find some evidence of improved learning gains on the SCS1, with average learning gains 66% higher than Codecademy (3.94 vs. 2.37 out of 27 questions).

When: 23 Feb 2017 - 11:15am Where: CSE 110

## Kyle Thayer, Qualifying Project Presentation

Title: Barriers Faced by Coding Bootcamp Students

Advisors: Andy Ko (iSchool) and Katharina Reinecke

Abstract: Coding bootcamps are a new and understudied way of training new software developers. We interviewed twenty-five current or former coding bootcamp students to learn about the barriers coding bootcamp students face. We used the Communities of Practice framework to analyze barriers in both the students' relation to coding bootcamps and their relation to the software industry. We found that bootcamps offer an alternate path into the software industry, providing a second chance for those who missed programming earlier, particularly for women who had not considered programming a possibility or had been too intimidated to try. As in other computing education environments, bootcamp students faced issues around "fitting in" and divisions between those who "get it," and those who don't. Bootcamps came with a great personal cost to bootcamp students, often including significant time and resources used outside the bootcamp to get into the software industry, though they sometimes expressed uncertainty of what was needed to make this transition successfully.

When: 22 Feb 2017 - 10:30am Where: CSE 303

## Ellis Michael, Qualifying Project Presentation

Title: Fault-Tolerant Distributed Protocols in Ordered Networks

Advisors: Dan Ports and Tom Anderson

Abstract: Protocols for fault-tolerant distributed systems typically assume a completely asynchronous network, in which messages can be dropped or reordered arbitrarily. Guaranteeing the safety of client-server based distributed storage systems in this context involves guaranteeing two distinct properties: order of state updates and their durability. This work shows that in single datacenters, separating these two concerns and guaranteeing ordering in the network using the capabilities of upcoming programmable switches can avoid most of the latency and throughput costs of replication and consistency. This work studies two related problems---state machine replication and distributed transaction processing---and presents two protocols---NOPaxos and Eris---which leverage different network ordering properties to deliver drastically improved normal case performance.

When: 21 Feb 2017 - 1:30pm Where: CSE 403

## Maaz Ahmad, Qualifying Project Presentation

Title: Optimizing Large Scale Data Intensive Applications Using Verified Lifting

Advisors: Alvin Cheung and Ras Bodik

AbstractWe present a tool called Casper that enables sequential data-intensive programs to automatically leverage the optimizations provided by parallel data processing frameworks. Casper works by lifting sequential code fragments to a high-level domain-specific language that can be easily retargeted to parallel data processing frameworks based on the MapReduce programming model. We use a novel cost-based syntax-guided synthesis algorithm and program verification to find sound functional MapReduce representations of the input code. Our Casper prototype currently targets sequential code fragments written in Java and retargets them to Apache Spark. Our evaluation shows that Casper can translate 46 benchmarks from 7 different suites, with the generated code performing on average 13x faster compared to the original sequential implementation.

When: 17 Feb 2017 - 3:30pm Where: CSE 403

## Aaron Walsman, Qualifying Project Presentation

Title: Tracking Articulated Objects with a High-Resolution Deformable Surface Model

Advisors: Dieter Fox and Brian Curless

Abstract: The last several years have seen significant progress in using depth cameras for tracking articulated objects such as human bodies, hands, or robotic manipulators.  Most approaches focus on tracking skeletal parameters of a fixed shape model, which makes them insufficient for robotics applications that require accurate estimates of object surfaces.  To overcome this limitation, we present a 3D model-based tracking system for articulated deformable objects.  Our system is able to track human body pose and high resolution surface contours in real time using a commodity depth sensor and GPU hardware.  We implement this as a joint optimization over a skeleton to account for changes in pose, and over the vertices of a high resolution mesh to track the subject's shape. With this model we are able to capture dynamic sub-centimeter surface detail such as face deformations and folds and wrinkles in clothing.  The end result is highly accurate spatiotemporal and semantic information which is well suited for physical human robot interaction.

When: 16 Feb 2017 - 10:00am Where: CSE 303

## Sam Castle, Qualifying Project Presentation

Title: Addressing the Challenges of Fraud in Financial Services for Emerging Digital Economies

Advisors: Richard Anderson and Franziska Roesner

Abstract: According to the World Bank, more than 2 billion people worldwide have no access to any financial services. In the developing world, where traditional banking infrastructure is limited, 8 out of 10 people now own a mobile phone, which can be used to save, send, and borrow money. Where there is money, there must be security, yet prior work on mobile money identified discouraging vulnerabilities in the current ecosystem. We assess these security concerns from two directions: (1) the risk introduced by existing vulnerabilities and (2) the actual exploits which occur in practice. We begin by defining a systematic threat model for this domain and applying it throughout our work. To understand existing vulnerabilities, we conducted a large-scale analysis of 197 current Android deployments and examined 71 products in more detail. After detecting substantial potential vulnerabilities, we interviewed 7 software developers in Africa and South America to investigate the root causes of these issues. We found that competent developers consistently lack relevant security tools, and we propose a way forward for designing such tools. In parallel research we attempted to measure the prevalence of exploits occurring in practice. Based on anecdotal evidence from users, mobile operators, and news sources, social engineering is the most common attack on mobile money users. We collected a new corpus of data on phishing over SMS, and we present an initial evaluation of this phenomenon with proposals to mitigate the risk of fraud.

When: 14 Feb 2017 - 12:00pm Where: CSE 203

## Sarah Elliott, Qualifying Project Presentation

Title: Using Learned Transition Models to Rearrange Dirt on a Surface

Advisors: Maya Cakmak and Dieter Fox

Abstract: We address the problem of enabling a robot manipulator to move arbitrary amounts and configurations of dirt on a surface to a goal region using a cleaning tool. We represent this problem as heuristic search with a set of primitive dirt-oriented tool actions. We present dirt and action representations that allow efficient learning and prediction of future dirt states, given the current dirt state and applied action. We also present a method for sampling promising tool actions based on a clustering of dirt states and heuristics for planning. We demonstrate the effectiveness of our approach on challenging cleaning tasks through an implementation on the PR2 robot.

When: 14 Feb 2017 - 11:00am Where: CSE 503

## Liang Luo, Qualifying Project Presentation

Title: INA: Accelerating DNN Training via In-Network Aggregation

Abstract: Recent growth of DNN development outpaces the growth in network bandwidth and GPU raw power. This causes the experiment time for finding the right neural network model to explode from a few days on a single machine to several weeks even months on a cluster with thousands of machines. DNN Frameworks such as MXNET enabled training DNNs on cluster or datacenter level to bring down training time. However, even with aggressive optimizations built-in, we found deficiency when scaling to more machines and deeper neural networks which stops the system throughput from scaling linearly, due to frequent parameter updates in the physical network, limited physical network and parameter server capacity.

This talk characterizes a detailed DNN training of today, pinpoints the current bottleneck in large scale DNN training, proposes principles and designs of In-Network Aggregation (INA) for accelerating datacenter level DNN workloads by exploiting parallel parameter aggregation right inside the network, in a hierarchical manner. INA removes the bottleneck in parameter synchronization by reducing both required network bandwidth and latency for DNN training with techniques such as in-place parallel stream aggregation and merging. We also project the effectiveness of INA using an INA model, which sees a sizable improvement in training time for RESNET-269 when compared with current state-of-the-art. Our results show the feasibility of running much larger DNNs with INA on existing datacenter infrastructures than they currently can handle.

When: 13 Feb 2017 - 11:00am Where: CSE 203