Machine Learning Seminars

To join the uw-nlp mailing list, to which these talks are announced, please visit this page. The mailing list is open to all.

Food at 2015–16 UW NLP talks is generously provided by Amazon.

Tuesday, May 31, 2016 - 12:00

Semantic Parsing to Probabilistic Programs for Situated Question Answering

Speaker: Jayant Krishnamurthy (AI2)
Location: CSE 305
Existing models for situated question answering make strong independence assumptions that negatively impact their accuracy. These assumptions, while empirically false, are necessary to facilitate inference because the number of joint question/environment interpretations is extremely large, typically superexponential in the number of objects in the environment. We present Parsing to Probabilistic Programs (P3), a novel situated question answering model that embraces approximate inference to eliminate these independence assumptions and enable the use of arbitrary global features of the question/environment interpretation. Our key insight is to treat semantic parses as probabilistic programs that are executed nondeterministically, and whose possible executions represent environmental uncertainty. We evaluate our approach on a new, publicly released data set of 5000 diagram questions from a science domain, finding that our approach outperforms several competitive baselines.
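
The key insight—treating a semantic parse as a nondeterministic program whose possible executions represent environmental uncertainty—can be illustrated with a toy sketch. The environment, objects, labels, and probabilities below are all invented for illustration and are not from the P3 system:

```python
from itertools import product

# Toy "environment": two objects whose color is uncertain.
# Each entry maps an object id to a distribution over labels.
env = {
    "obj1": {"red": 0.9, "blue": 0.1},
    "obj2": {"red": 0.4, "blue": 0.6},
}

def executions(env):
    """Enumerate every joint interpretation of the environment
    (one label per object) together with its probability."""
    objs = list(env)
    for labels in product(*(env[o] for o in objs)):
        prob = 1.0
        for o, lab in zip(objs, labels):
            prob *= env[o][lab]
        yield dict(zip(objs, labels)), prob

def answer_distribution(env, parse):
    """Execute a parse (a function of one concrete world) in every
    possible world; weight each answer by the world's probability."""
    dist = {}
    for world, prob in executions(env):
        ans = parse(world)
        dist[ans] = dist.get(ans, 0.0) + prob
    return dist

# Parse for "how many objects are red?"
count_red = lambda world: sum(lab == "red" for lab in world.values())

print(answer_distribution(env, count_red))  # approx {2: 0.36, 1: 0.58, 0: 0.06}
```

Exhaustive enumeration like this is exactly what becomes intractable (superexponential in the number of objects), which is why P3 relies on approximate inference.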

Tuesday, May 24, 2016 - 12:00

Learning to Converse: An End-to-End Neural Approach

Speaker: Michel Galley (MSR)
Location: CSE 305
Until recently the goal of training open-domain conversational systems that emulate human conversation has seemed elusive. However, the vast quantities of conversational exchanges now available from social media, instant messaging, and other online resources enable the building of data-driven models that can engage in natural and sustained conversations. In this talk, I will present an open-domain LSTM-based conversational model trained end-to-end from millions of conversations, without any implicit assumptions about dialog structure. I will focus on the technical challenges in applying neural models to conversational data, in particular, (1) overcoming the overwhelming prevalence of bland and safe responses (e.g., "I don't know"), and (2) promoting responses that reflect a consistent persona. Finally, I will overview our current efforts towards more grounded and goal-oriented conversations. If time permits, I will show a demo of our conversational system. This is joint work with Jiwei Li, Alessandro Sordoni, Chris Brockett, Jianfeng Gao, and Bill Dolan.

Tuesday, May 3, 2016 - 12:00

Supersizing Self-Supervision: ConvNets and Common Sense without Manual Supervision

Speaker: Abhinav Gupta (CMU)
Location: CSE 305
In this talk, I will discuss how to learn visual representations and common sense knowledge without using any manual supervision. First, I am going to discuss how we can learn ConvNets in a completely unsupervised manner using auxiliary tasks. Specifically, I am going to demonstrate how spatial context in images and viewpoint changes in videos can be used to train visual representations. Then, I am going to introduce NEIL (Never Ending Image Learner), a computer program that runs 24x7 to automatically build visual detectors and common sense knowledge from web data. NEIL is an attempt to develop a large and rich visual knowledge base with minimum human labeling effort. Every day, NEIL scans through images of our mundane world, and little by little, it learns common sense relationships about our world. For example, with no input from humans, NEIL can tell you that trading floors are crowded and babies have eyes. In eight months, NEIL has analyzed more than 25 million images, labeled ~4M annotations (boxes and segments), learned models for 7500 concepts, and discovered more than 20K common sense relationships. Finally, in an effort to diversify the knowledge base, I will briefly discuss how NEIL is also being extended to a physical robot that learns knowledge about actions.
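
The spatial-context auxiliary task can be sketched in a few lines: sample a patch and one of its eight neighbors from an image, and use the neighbor's relative position as a free supervisory label. This is a simplified sketch of the idea, not the actual training pipeline:

```python
import random

def sample_context_pair(image, patch=2):
    """From a 2-D image (list of lists), pick a random anchor patch and
    one of its 8 neighboring patches; the neighbor's relative position
    (a label in 0..7) comes for free -- no human annotation needed."""
    h, w = len(image), len(image[0])
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    # anchor top-left corner, chosen so all 8 neighbors stay in bounds
    ay = random.randrange(patch, h - 2 * patch + 1)
    ax = random.randrange(patch, w - 2 * patch + 1)
    label = random.randrange(8)
    dy, dx = offsets[label]
    ny, nx = ay + dy * patch, ax + dx * patch
    crop = lambda y, x: [row[x:x + patch] for row in image[y:y + patch]]
    return crop(ay, ax), crop(ny, nx), label

image = [[y * 10 + x for x in range(10)] for y in range(10)]
anchor, neighbour, label = sample_context_pair(image)
# a ConvNet trained to predict `label` from (anchor, neighbour)
# learns useful visual representations without manual supervision
```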

Tuesday, April 19, 2016 - 12:00

Deep Robotic Learning

Speaker: Sergey Levine
Location: CSE 305
Humans and animals have a remarkable ability to autonomously acquire new behaviors. My work is concerned with designing algorithms that aim to bring this ability to robots and other autonomous systems that must make decisions in complex, unstructured environments. A central challenge that such algorithms must address is to learn behaviors with representations that are sufficiently general and expressive to handle the wide range of motion skills that are needed for real-world applications. This requires processing complex, high-dimensional inputs and outputs, such as camera images and joint torques, and providing considerable generality to various physical platforms and behaviors. I will present some of my recent work on policy learning, demonstrating that complex, expressive policies represented by deep neural networks can be used to learn controllers for a wide range of robotic platforms, including dexterous hands, autonomous aerial vehicles, simulated bipedal walkers, and robotic arms. I will show how deep convolutional neural networks can be trained to directly learn policies that combine visual perception and control, acquiring the entire mapping from rich visual stimuli to motor torques on a PR2 robot. I will also present some recent work on scaling up deep robotic learning on a cluster consisting of multiple robotic arms, and demonstrate results for learning grasping strategies that involve continuous feedback and hand-eye coordination.

Friday, April 1, 2016 - 12:00

Machine learning approach to identify novel cancer therapeutic targets

Speaker: Su-In Lee
Location: CSE 305
Cancer is full of mysteries. Two individuals with seemingly similar tumors sometimes have very different responses to chemotherapy and other treatments, as well as drastically different survival outcomes. In order to better understand this phenomenon, researchers have developed ways to obtain a molecular snapshot of an individual’s tumor, which often contains at least tens of thousands of measurements. Analyzing these molecular data from cancer patients holds great promise for identifying novel therapeutic targets, but involves significant challenges caused by the high dimensionality of the data (i.e., p>>n) and discrepancies in measurements across different studies. In this talk, I will present machine learning techniques we recently developed to resolve these challenges. These methods learn low-dimensional network features – such as modules, densely connected sub-networks and perturbed nodes across conditions – that are likely to represent important molecular events in the disease process in an unsupervised fashion, based on molecular profiles from multiple populations of cancer patients. In collaboration with UW Pathology, UW Genome Sciences and the Stanford Center for Cancer Systems Biology, we have made novel discoveries that can lead to better ways to treat cancer.

Tuesday, March 1, 2016 - 12:00

Statistical Methods for Differential Network Analysis

Speaker: Ali Shojaie (UW)
Location: CSE 305
Recent evidence suggests that changes in biological networks, e.g., rewiring or disruption of key interactions, may be associated with development of complex diseases. These findings have motivated new research initiatives in computational and experimental biology that aim to obtain condition-specific estimates of biological networks, e.g., for normal and tumor samples, and identify differential patterns of connectivity in such networks, known as "differential network analysis". In this talk, I will present new methods based on graphical models and statistical learning, to jointly learn networks of interactions among components of biological systems in heterogeneous populations, and to formally test whether the observed differences in interaction patterns are statistically significant or are due to randomness in estimation procedures.
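
A bare-bones caricature of the estimation step may help fix ideas: estimate a network per condition from its samples and compare edge sets. This sketch uses unpenalized partial correlations with an arbitrary threshold; the penalized joint estimators and formal significance tests are the substance of the talk, and the simulated data below is invented:

```python
import numpy as np

def edge_set(samples, thresh=0.2):
    """Estimate a network from samples (n x p): invert the covariance
    to get partial correlations, then keep edges above a threshold.
    (Real methods use penalized estimators; this is only a sketch.)"""
    prec = np.linalg.pinv(np.cov(samples, rowvar=False))
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)   # standard partial-correlation formula
    p = prec.shape[0]
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(pcorr[i, j]) > thresh}

rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 3))
# condition A: variables 0 and 1 are linked; condition B: link removed
a = np.column_stack([z[:, 0], z[:, 0] + 0.5 * z[:, 1], z[:, 2]])
b = z.copy()
diff = edge_set(a) ^ edge_set(b)   # symmetric difference = rewired edges
print(diff)                        # edge (0, 1) is flagged as differential
```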

Tuesday, February 16, 2016 - 12:00

Deep Neural Network for Fast Object Detection and Newtonian Image Understanding

Speaker: Mohammad Rastegari (AI2)
Location: CSE 305
In this talk, I introduce G-CNN, an object detection technique based on Convolutional Neural Networks (CNNs), which works without proposal algorithms. G-CNN starts with a multi-scale grid of fixed bounding boxes. We train a regressor to move and scale elements of the grid towards objects iteratively. G-CNN models the problem of object detection as finding a path from a fixed grid to boxes tightly surrounding the objects. G-CNN with around 180 boxes in a multi-scale grid performs comparably to Fast R-CNN, which uses around 2K bounding boxes generated with a proposal technique. This strategy makes detection faster by removing the object proposal stage and reducing the number of boxes to be processed. Next, I discuss the challenging problem of predicting the dynamics of objects in static images. Given a query object in an image, our goal is to provide a physical understanding of the object in terms of the forces acting upon it and its long-term motion in response to those forces. Direct and explicit estimation of the forces and the motion of objects from a single image is extremely challenging. We define intermediate physical abstractions called Newtonian scenarios and introduce the Newtonian Neural Network (N3), which learns to map a single image to a state in a Newtonian scenario.
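
The iterative grid-to-object idea can be caricatured in a few lines: start from a fixed grid box and repeatedly apply a regression step that moves and rescales it toward the object. In G-CNN a trained network predicts each step from image features; here the step is hand-coded toward a known target, purely to illustrate the "path from grid to object" view:

```python
def step(box, target, lr=0.5):
    """One regression step: move/scale the box a fraction of the way
    toward the target. (A trained regressor predicts this step from
    image features; we fake it here since the target is known.)"""
    return tuple(b + lr * (t - b) for b, t in zip(box, target))

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

grid_box = (0.0, 0.0, 100.0, 100.0)    # one cell of the fixed grid
object_box = (40.0, 30.0, 90.0, 70.0)  # ground-truth object

box = grid_box
for i in range(5):                     # a few iterations suffice
    box = step(box, object_box)
    print(i, round(iou(box, object_box), 3))   # IoU climbs toward 1
```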

Tuesday, February 2, 2016 - 12:00

Structured Learning Algorithms for Entity Linking and Semantic Parsing

Speaker: Ming-Wei Chang, Microsoft Research
Location: CSE 305
Recent advances in natural language processing have had a significant impact on the fields of information extraction and retrieval. Specifically, tasks like entity linking and semantic parsing are crucial to search engine applications such as question answering and document understanding. These tasks often require structured learning models, which make predictions on multiple interdependent variables. In this talk, we argue that carefully designed structured learning algorithms play a crucial role in entity linking and semantic parsing. In particular, we will first present structured learning models for entity linking, where the models jointly detect mentions and disambiguate entities. We then show that a novel staged search procedure for question semantic parsing can significantly improve knowledge base question answering systems. Finally, I will discuss some challenges and opportunities for machine learning techniques in general semantic parsing tasks.
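
To make the "jointly detect mentions and disambiguate entities" idea concrete, here is a toy joint inference sketch. The mentions, candidate entities, and scores are all invented: each candidate mention may be kept or dropped (None), each kept mention is assigned an entity, and the global score adds a coherence bonus for entities that belong together:

```python
from itertools import product

# Candidate mentions with per-entity local scores; None = drop the mention.
candidates = {
    "Washington": {"Washington_state": 1.0, "George_Washington": 0.8, None: 0.2},
    "UW": {"Univ_of_Washington": 1.2, "Univ_of_Wisconsin": 1.3, None: 0.1},
}
# Pairwise coherence bonus for compatible entities.
coherence = {frozenset(["Washington_state", "Univ_of_Washington"]): 1.5}

def joint_score(assignment):
    score = sum(candidates[m][e] for m, e in assignment.items())
    chosen = [e for e in assignment.values() if e is not None]
    for i, a in enumerate(chosen):
        for b in chosen[i + 1:]:
            score += coherence.get(frozenset([a, b]), 0.0)
    return score

mentions = list(candidates)
best = max(
    (dict(zip(mentions, choice))
     for choice in product(*(candidates[m] for m in mentions))),
    key=joint_score,
)
print(best)
# locally "Univ_of_Wisconsin" scores higher for "UW", but coherence
# with "Washington_state" flips the joint decision
```

Real systems replace this exhaustive enumeration with structured inference and learn the scores, but the interdependence between variables is the same.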

Tuesday, January 12, 2016 - 12:00

Discovering Hidden Structure in the Sparse Regime

Speaker: Sham Kakade
Location: Gates Commons
In many applications, we face the challenge of modeling the hidden interactions between multiple observations (e.g., discovering clusters of points in space or learning topics in documents). An added difficulty is that our datasets often have empirical distributions which are heavy-tailed (e.g., problems in natural language processing). In other words, even though we have large datasets, we are often in a sparse regime where a large fraction of our items have been observed only a few times (e.g., Zipf's law, which states that regardless of how big our corpus of text is, a large fraction of the words in our vocabulary will be observed only a few times). The question we consider is how to learn a model of our data when our dataset is large and yet sparse. We provide an algorithm for learning certain natural latent variable models applicable to this sparse regime, making connections to a body of recent work in sparse random graph theory and community detection. We also discuss the implications for practice.
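
The Zipf's-law point is easy to verify numerically: even with a million-token corpus, a large fraction of the vocabulary is expected to appear only a handful of times. The corpus and vocabulary sizes below are arbitrary choices for illustration:

```python
# Expected count of the rank-r word under Zipf's law (freq proportional
# to 1/r): count(r) = N / (r * H_V), where H_V is the V-th harmonic number.
V = 50_000          # vocabulary size
N = 1_000_000       # corpus size in tokens
H = sum(1.0 / r for r in range(1, V + 1))

expected = [N / (r * H) for r in range(1, V + 1)]
rare = sum(c <= 3 for c in expected)   # words expected at most 3 times
print(f"{rare / V:.0%} of the vocabulary expected at most 3 times")
# roughly 40% of word types remain rare despite the large corpus
```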

Tuesday, June 2, 2015 - 12:30

Statistical machine learning methods for the analysis of large networks

Speaker: Edo Airoldi
Location: CSE 305
Network data -- i.e., collections of measurements on pairs, or tuples, of units in a population of interest -- are ubiquitous nowadays in a wide range of machine learning applications, from molecular biology to marketing on social media platforms. Surprisingly, assumptions underlying popular statistical methods are often untenable in the presence of network data. Established machine learning algorithms often break when dealing with combinatorial structure. And the classical notions of variability, sample size and ignorability take on unintended connotations. These failures open the door to a number of technical challenges, and to opportunities for introducing new fundamental ideas and developing new insights. In this talk, I will review open statistical and machine learning problems that arise when dealing with large networks, mostly focusing on modeling and inferential issues, and provide an overview of key technical ideas, recent results, and trends.

Edo Airoldi received a PhD from Carnegie Mellon University in 2007, working at the intersection of statistical machine learning and computational social science with Stephen Fienberg and Kathleen Carley. His PhD thesis explored modeling approaches and inference strategies for analyzing social and biological networks. Until December 2008, he was a postdoctoral fellow in the Lewis-Sigler Institute for Integrative Genomics and the Department of Computer Science at Princeton University, working with Olga Troyanskaya and David Botstein. They developed mechanistic models of regulation, leveraging high-throughput technology, to gain insights into aspects of cellular dynamics that are not directly measurable at the desired resolution, such as growth rate. He joined the Statistics Department at Harvard University in 2009.