Machine Learning Seminars

Sponsored by

Title: Density Tree and Density Ranking in Singular Measures
Speaker: Yen-Chi Chen
When: Tuesday, October 17, 2017 - 12:00
Location: CSE 403
A density tree (also known as a cluster tree of a probability density function) is a tool in topological data analysis that uses a tree structure to represent the shape of a density function. Even if the density function is multivariate, a density tree can always be displayed on a two-dimensional plane, making it an ideal tool for visualizing the shape of a multivariate dataset. However, in complex datasets such as GPS data, the underlying distribution function is singular so the usual density function and density tree no longer exist. To analyze this type of data and generalize the density tree, we introduce the concept of density ranking and ranking tree (also called an alpha-tree). We then show that one can consistently estimate the density ranking and the ranking tree using a kernel density estimator. Based on the density ranking, we introduce several geometric and topological summary curves for analyzing GPS datasets. Bio: Yen-Chi Chen is an Assistant Professor in the Department of Statistics, a data science fellow in the eScience Institute, and a statistician in the National Alzheimer's Coordinating Center at the University of Washington. Yen-Chi obtained a PhD in Statistics from Carnegie Mellon University in June 2016, advised by Professor Larry Wasserman and Christopher Genovese. His current research is on nonparametric statistics, topological data analysis, cluster analysis, and applications in Astronomy, industrial engineering, and medical research.


Title: Causal Learning
Speaker: Bernhard Schölkopf
When: Wednesday, August 16, 2017 - 15:00
Location: CSE 305
In machine learning, we use data to automatically find dependences in the world, with the goal of predicting future observations. Most machine learning methods build on statistics, but one can also try to go beyond this, assaying causal structures underlying statistical dependences. Can such causal knowledge help prediction in machine learning tasks? We argue that this is indeed the case, due to the fact that causal models are more robust to changes that occur in real world datasets. We discuss implications of causality for machine learning tasks, and argue that many of the hard issues benefit from the causal viewpoint. This includes domain adaptation, semi-supervised learning, transfer, life-long learning, and fairness, as well as an application to the removal of systematic errors in astronomical problems.

Semantic Scholar page: Bernhard Schölkopf


Title: Revisiting the PC-Algorithm: How Much Conditioning Is Needed?
Speaker: Ali Shojaie
When: Tuesday, May 9, 2017 - 12:00
Location: CSE 305
The PC-Algorithm, named after its inventors, Peter Sprites and Clark Glymour, is the gold standard for learning directed acyclic graphs (DAGs) from observational data. The algorithm was recently popularized by Kalisch and Bulhmann for learning sparse high-dimensional DAGs, containing many nodes, from smaller number of observations. However, the computational and sample complexity of the PC-Algorithm scale with the maximum degree of the network. The maximum degree grows with the number of nodes in many random graph families, including power-law graphs commonly observed in real-world networks. The algorithm is thus inappropriate for learning such networks. Moreover, the PC-Algorithm requires a stringent faithfulness assumption, which has been shown not to hold in high dimensions.

In this work, we exploit properties of large random graphs to show that DAGs can be learned by only conditioning on sets of small cardinality. In other words, for many graph families, we justify early stopping of PC-Algorithm’s search. While guaranteed to learn the skeleton of high-dimensional DAGs, this simple modification also turns out to greatly improve the computational and sample complexity of the PC-Algorithm. The new algorithm also requires a weaker faithfulness assumption and results in improved empirical performance in DAGs with hub nodes.

Semantic Scholar page: Ali Shojaie


Title: Continuous State Machines and Grammars for Linguistic Structure Prediction
Speaker: Noah Smith
When: Tuesday, April 25, 2017 - 12:00
Location: CSE 305
Linguistic structure prediction infers abstract representations of text, like syntax trees and semantic graphs, enabling interpretation in applications like question answering, information extraction, and opinion analysis. This talk is about the latest family of methods for linguistic structure prediction, which make heavy use of representation learning via neural networks. I'll present these new methods as continuous generalizations of state machines and probabilistic grammars. I'll show how they've led to fast and accurate performance on several syntactic and semantic parsing problems.

Semantic scholar page: Noah Smith


Title: Sequence Modeling: From Spectral Methods and Bayesian Nonparametrics to Deep Learning
Speaker: Alex Smola
When: Tuesday, April 11, 2017 - 12:00
Location: MOR 220
In this talk, Alex Smola will summarize a few recent developments in the design and analysis of sequence models. Starting with simple parametric models such as HMMs for sequences we look at nonparametric extensions in terms of their ability to model more fine-grained types of state and transition behavior. In particular we consider spectral embeddings, nonparametric Bayesian models such as the nested Chinese Restaurant Franchise and the Dirichlet-Hawkes Process. We conclude with a discussion of deep sequence models for user return time modeling, time-dependent collaborative filtering, and large-vocabulary user profiling.


Title: Streaming Optimization Methods: Stochastic Gradient Descent, Momentum, and Newton's Method
Speaker: Sham Kakade
When: Tuesday, February 21, 2017 - 12:00
Location: CSE 305
Theory and practice differ widely in our more basic optimization algorithms, such as stochastic gradient descent. For example, for the practically relevant problem of setting parameters (such as learning rates), principled methods are almost never used in practice. Can we get a better a handle of the true behavior this widely used algorithm (as opposed to worst case analysis)? Can we use these insights to provide better algorithms?

This talk by Sham Kakade will survey recent work on this question. In particular, we seek to understand how fast we can parallelize these algorithms using simple averaging and mini-batching procedures. The talk will also discuss ongoing work with regards to streaming methods for momentum (i.e. acceleration), for implementing second order methods, and for non-convex optimization.


Title: Physics-based Visual Reasoning
Speaker: Roozbeh Mottaghi (AI2)
When: Tuesday, February 7, 2017 - 12:00
Location: CSE 305
Despite recent progress, AI is still far from understanding the physics of the world, and there is a large gap between the abilities of humans and the state-of-the-art methods. In this talk, I will focus on physics-based scene understanding and visual reasoning, which is a crucial next step in computer vision and AI.

Bio: Roozbeh Mottaghi is a Research Scientist at Allen Institute for Artificial Intelligence (AI2). Prior to joining AI2, he was a post-doctoral researcher at the Computer Science Department at Stanford University. He obtained his PhD in Computer Science in 2013 from University of California, Los Angeles. His research is mainly focused on computer vision and machine learning.


Title: Interactive and Interpretable Machine Learning Models for Human Machine Collaboration
Speaker: Been Kim (AI2)
When: Tuesday, January 24, 2017 - 12:00
Location: CSE 305
I envision a system that enables successful collaborations between humans and machine learning models by harnessing the relative strength to accomplish what neither can do alone. Machine learning techniques and humans have skills that complement each other --- machine learning techniques are good at computation on data at the lowest level of granularity, whereas people are better at abstracting knowledge from their experience, and transferring the knowledge across domains. The goal of my research is to develop a framework for human-in-the-loop machine learning that enables people to interact effectively with machine learning models to make better decisions using large datasets, without requiring in-depth knowledge about machine learning techniques.
In this talk, I present the Bayesian Case Model (BCM), a general framework for Bayesian case-based reasoning (CBR) and prototype classification and clustering. BCM brings the intuitive power of CBR to a Bayesian generative framework. The BCM learns prototypes, the “quintessential” observations that best represent clusters in a dataset, by performing joint inference on cluster labels, prototypes and important features. Simultaneously, BCM pursues sparsity by learning subspaces, the sets of features that play important roles in the characterization of the prototypes. The prototype and subspace representation provides quantitative benefits in interpretability while preserving classification accuracy. Human subject experiments verify statistically significant improvements to participants’ understanding when using explanations produced by BCM, compared to those given by prior art. I demonstrate the application of this model for an educational domain in which teachers cluster program

Bio: Been Kim is a Research Scientist at Allen Institute for Artificial Intelligence (AI2). She is also an affiliate faculty in the Department of Computer Science & Engineering at the University of Washington. Her research focuses on building interpretable machine learning. The vision of her research is to make humans empowered by machine learning, not overwhelmed by it. She received her PhD. from MIT. Prior to her PhD, she worked at the MathWorks as a software engineer.


Title: Probabilistic programming: representations, inference, and applications to natural language
Speaker: Noah Goodman (Stanford)
When: Friday, January 13, 2017 - 12:00
Location: EEB 037
Probabilistic programing languages (PPLs) are formal languages for probabilistic modeling that describe complex distributions via programs with random choices. As a description language PPLs are a convenient and powerful way to construct models. I will show several examples drawn from cognitive science, focussing on language understanding: I will use a PPL to construct a model of language understanding as social reasoning; this model captures aspects of vague and figurative language. PPL implementations make it possible to separate inference algorithms from model representation by providing universal inference engines. Many techniques have been explored for inference, but they all struggle with efficiency for different model classes. Instead, we have been exploring systems that learn efficient inference strategies. I will discuss “deep amortized inference for PPLs”, a system that optimizes deep neural network “guide programs” to capture the implicit posterior distribution of a probabilistic program.

Bio: Noah Goodman is Associate Professor of Psychology, Computer Science (by courtesy), and Linguistics (by courtesy) at Stanford University. He studies the computational basis of natural and artificial intelligence, merging behavioral experiments with formal methods from statistics and programming languages. His research topics include language understanding, social reasoning, concept learning, probabilistic programming languages, and applications. ProfessorGoodman received his Ph.D. in mathematics from the University of Texas at Austin in 2003. In 2005 he entered cognitive science, working as Postdoc and Research Scientist at MIT. In 2010 he moved to Stanford where he runs the Computation and Cognition Lab. Professor Goodmanhas published more than 150 papers in fields including psychology, linguistics, computer science, and mathematics. His work has been recognized by the James S. McDonnell Foundation Scholar Award, the Roger N. Shepard Distinguished Visiting Scholar Award, the Alfred P. Sloan Research Fellowship in Neuroscience, and six computational modeling prizes from the Cognitive Science Society. He is Fellow of the Uber AI Labs, Academic Co-Founder and Advisor of Gamalon Labs, and advisor to several other start-ups.


Title: Computational and Statistical Issues in Genomics
Speaker: David Heckerman (MSR)
When: Tuesday, November 29, 2016 - 12:00
Location: CSE 305
In the last decade, genomics has seen an explosion in the production of data due to the decreasing costs and processing times associated with DNA sequencing. I will discuss how the cloud as well as techniques from mathematics and computer science help take advantage of this big data. My discussion will include linear mixed models, a popular model for association studies. I will show how these models work well and, if there is time, talk about why they do.