Food at 2015-6 UW NLP talks is generously provided by Amazon.
Tuesday, February 2, 2016 - 12:00
Structured Learning Algorithms for Entity Linking and Semantic Parsing
Tuesday, June 2, 2015 - 12:30
Statistical machine learning methods for the analysis of large networks
Edo Airoldi received a PhD from Carnegie Mellon University in 2007, working at the intersection of statistical machine learning and computational social science with Stephen Fienberg and Kathleen Carley. His PhD thesis explored modeling approaches and inference strategies for analyzing social and biological networks. Until December 2008, he was a postdoctoral fellow in the Lewis-Sigler Institute for Integrative Genomics and the Department of Computer Science at Princeton University working with Olga Troyanskaya and David Botstein. They developed mechanistic models of regulation, leveraging high-throughput technology, to gain insights into aspects of cellular dynamics that are not directly measurable at the desired resolution, such as growth rate. He joined the Statistics Department at Harvard University in 2009.
Tuesday, May 19, 2015 - 12:30
Diverse Particle Selection for High-Dimensional Inference in Graphical Models
Rich graphical models for real-world scene understanding encode the shape and pose of objects via high-dimensional, continuous variables. We describe a particle-based max-product inference algorithm which maintains a diverse set of posterior mode hypotheses, and is robust to initialization. At each iteration, the set of particle hypotheses is augmented via stochastic proposals, and then reduced via an optimization algorithm that minimizes distortions in max-product messages. Our particle selection metric is submodular, and thus efficient greedy algorithms have rigorous optimality guarantees. By avoiding the stochastic resampling steps underlying standard particle filters, we also avoid common degeneracies where particles collapse onto a single hypothesis. Our approach significantly outperforms previous particle-based algorithms in the estimation of human pose from single images, and the prediction of protein side-chain conformations.
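The greedy selection idea mentioned in the abstract can be illustrated with a small, generic sketch. It is not the speaker's algorithm: the objective here is a stand-in (set coverage, a textbook monotone submodular function) rather than the max-product message distortion metric from the talk, and the names `greedy_select` and `coverage_gain` are hypothetical.

```python
def greedy_select(candidates, k, gain):
    """Pick up to k candidates, each time taking the largest marginal gain."""
    selected = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        # Marginal gain of adding c to the current selection.
        best = max(remaining, key=lambda c: gain(selected, c))
        selected.append(best)
        remaining.remove(best)
    return selected


def coverage_gain(sets):
    """Stand-in objective (assumption): number of newly covered elements."""
    def gain(selected, candidate):
        covered = set().union(*(sets[s] for s in selected))
        return len(sets[candidate] - covered)
    return gain


# Hypothetical "particles", each covering some hypotheses.
particles = {"a": {1, 2, 3}, "b": {3, 4}, "c": {1, 2}}
chosen = greedy_select(["a", "b", "c"], 2, coverage_gain(particles))
```

For monotone submodular objectives under a cardinality constraint, this kind of greedy rule is known to come within a factor of 1 - 1/e of the optimum, which is the flavor of guarantee the abstract alludes to.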
Erik B. Sudderth is an Assistant Professor in the Brown University Department of Computer Science. He received the Bachelor's degree (summa cum laude, 1999) in Electrical Engineering from the University of California, San Diego, and the Master's and Ph.D. degrees (2006) in EECS from the Massachusetts Institute of Technology. His research interests include probabilistic graphical models; nonparametric Bayesian methods; and applications of statistical machine learning in computer vision and the sciences. He received an NSF CAREER award, and was named one of "AI's 10 to Watch" by IEEE Intelligent Systems Magazine.
Wednesday, May 6, 2015 - 12:30
Graphical Modeling with the Bethe Approximation
Tuesday, January 27, 2015 - 12:30
Degree, curvature, and mixing of random walks on the phylogenetic subtree-prune-regraft graph, and what it tells us about phylogenetic inference via MCMC
Tuesday, January 13, 2015 - 12:30
Driving Time Variability Prediction Using Mobile Phone Location Data
We introduce a method to predict the variability in (probability distribution of) driving time on an arbitrary route in a road network at a given time, using mobile phone GPS data. Although commercial mapping services currently provide a high-quality estimate of driving time on a given route, there can be considerable uncertainty in that prediction due, for example, to unknown timing of traffic signals, uncertainties in traffic congestion levels, and differences in driver habits. For this reason, a distribution prediction can be more valuable than a deterministic prediction of driving time, by accounting not just for the measured traffic conditions and other available information, but also for the presence of unmeasured conditions that also affect driving time. Accurate distribution predictions can be used to report variability to the user, to provide risk-averse route recommendations, and as a part of vehicle fleet decision support systems. Simple approaches to distribution prediction assume independence in driving time across road segments and as a result dramatically underestimate the variability in driving time. We propose a method that accurately accounts for dependencies in driving time across road segments, and apply it to large volumes of mobile phone GPS data from the Seattle metropolitan region.
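The claim that independence underestimates variability follows from the variance of a sum: Var(T) = sum_i sum_j rho_ij * s_i * s_j, which exceeds the sum of the per-segment variances whenever segments are positively correlated. The sketch below works through that arithmetic. The segment count, standard deviations, and the single shared correlation value are made-up numbers for illustration, not figures from the talk, and `total_time_std` is a hypothetical helper.

```python
import math


def total_time_std(segment_stds, correlation):
    """Std. dev. of total route time, assuming every pair of distinct
    segments shares the same correlation (a simplifying assumption)."""
    variance = 0.0
    for i, s_i in enumerate(segment_stds):
        for j, s_j in enumerate(segment_stds):
            rho = 1.0 if i == j else correlation
            variance += rho * s_i * s_j
    return math.sqrt(variance)


# Hypothetical route: 16 segments, each with a 30-second standard deviation.
stds = [30.0] * 16
independent = total_time_std(stds, 0.0)  # sqrt(16 * 900) = 120 seconds
correlated = total_time_std(stds, 0.5)   # sqrt(122400), roughly 349.9 seconds
```

In this toy setting, a moderate positive correlation nearly triples the spread of the route time relative to the independence assumption.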
Tuesday, November 4, 2014 - 12:30
Thursday, October 30, 2014 - 12:30
Deep Representation Learning: Challenges and New Directions
Machine learning is a powerful tool for tackling challenging problems
in artificial intelligence. In practice, success of machine learning
algorithms critically depends on the feature representations for input
data, which often becomes a limiting factor. To address this problem,
deep learning methods have recently emerged as successful techniques
to learn feature hierarchies from unlabeled and labeled data. In this
talk, I will present my perspectives on the progress, challenges, and
some new directions. Specifically, I will talk about my recent work to
address the following interrelated challenges: (1) how can we learn
invariant yet discriminative features, and furthermore disentangle
underlying factors of variation to model high-order interactions
between the factors? (2) how can we learn representations of the
output data when the output variables have complex high-order
dependencies? (3) how can we learn shared representations from
heterogeneous input data modalities?
Honglak Lee is an Assistant Professor of Computer Science and
Engineering at the University of Michigan, Ann Arbor. He received his
Ph.D. from the Computer Science Department at Stanford University in
2010, advised by Prof. Andrew Ng. His primary research interests lie in
machine learning, spanning deep learning, unsupervised and
semi-supervised learning, transfer learning, graphical models, and
optimization. He also works on application problems in computer
vision, audio recognition, robot perception, and text processing. His
work received best paper awards at ICML and CEAS. He has served as a
guest editor of the IEEE TPAMI Special Issue on Learning Deep
Architectures, as well as an area chair for ICML and NIPS. He received
the Google Faculty Research Award in 2011, and was selected by IEEE
Intelligent Systems as one of AI's 10 to Watch in 2013.
Tuesday, October 21, 2014 - 12:30
Massive, Sparse, Efficient Multilabel Learning
Amazon has many applications whose core is multilabel
classification. This talk will present progress towards a multilabel
learning method that can handle 10^7 training examples, 10^6 features, and
10^5 labels on a single workstation. A sparse linear model is learned for
each label simultaneously by stochastic gradient descent with L2 and L1
regularization. Tractability is achieved through careful use of sparse data
structures, and speed is achieved by using the latest stochastic gradient
methods that do variance reduction. Both theoretically and practically,
these methods achieve order-of-magnitude faster convergence than Adagrad.
We have extended them to handle non-differentiable L1 regularization. We
show experimental results on classifying biomedical articles into 26,853
scientific categories. [Joint work with Galen Andrew, ML intern at Amazon.]
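One common way to combine stochastic gradient steps with a non-differentiable L1 penalty is a proximal (soft-thresholding) update. The sketch below shows that idea for a single label with a sparse weight dictionary. It is an illustration under stated assumptions, not the method from the talk: it uses plain SGD with squared loss rather than a variance-reduced method, it only regularizes the features active in the current example, and `soft_threshold` and `prox_sgd_step` are hypothetical names.

```python
def soft_threshold(w, t):
    """Proximal operator of t * |w|: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def prox_sgd_step(weights, features, label, lr, l1, l2):
    """One proximal SGD step on a sparse example.

    weights:  dict mapping feature name -> weight (zeros are not stored)
    features: dict mapping feature name -> value for this example
    Assumption: squared loss, chosen only to keep the sketch short.
    """
    pred = sum(weights.get(f, 0.0) * v for f, v in features.items())
    err = pred - label
    for f, v in features.items():
        w = weights.get(f, 0.0)
        # Gradient step on the smooth part (loss plus L2 penalty).
        w -= lr * (err * v + l2 * w)
        # Proximal step for the L1 penalty; exact zeros keep the model sparse.
        w = soft_threshold(w, lr * l1)
        if w == 0.0:
            weights.pop(f, None)
        else:
            weights[f] = w
    return weights


dense = prox_sgd_step({}, {"a": 1.0}, 1.0, lr=0.1, l1=0.0, l2=0.0)
sparse = prox_sgd_step({}, {"a": 1.0}, 1.0, lr=0.1, l1=2.0, l2=0.0)
```

Storing only nonzero weights is what keeps memory proportional to the number of active features, which matters at the scale of features and labels described in the abstract.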
Bio: Charles Elkan is the first Amazon Fellow, on leave from being a
professor of computer science at the University of California, San Diego.
In the past, he has been a visiting associate professor at Harvard and a
researcher at MIT. His published research has been mainly in machine
learning, data science, and computational biology. The MEME algorithm that
he developed with Ph.D. students has been used in over 3000 published
research projects in biology and computer science. He is fortunate to have
had inspiring undergraduate and graduate students who are now in leadership
positions, such as vice president at Google.
Tuesday, October 7, 2014 - 12:30
Learning Mixtures of Ranking Models
Probabilistic modeling of ranking data is an extensively studied
problem with applications ranging from understanding user preferences
in electoral systems and social choice theory, to more modern learning
tasks in online web search, crowd-sourcing and recommendation
systems. This work concerns learning the Mallows model -- one of the
most popular probabilistic models for analyzing ranking data. In this
model, the user's preference ranking is generated as a noisy version
of an unknown central base ranking. The learning task is to recover
the base ranking and the model parameters using access to noisy
rankings generated from the model.
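To make the generative model concrete: in the standard Mallows model, a ranking has probability proportional to phi raised to its Kendall tau distance from the base ranking, with dispersion phi between 0 and 1. The sketch below samples from a single Mallows model using the repeated insertion procedure. The function names and the use of phi as the parameter name are choices made here for illustration; the mixture-learning algorithm from the talk is not shown.

```python
import random


def kendall_tau(ranking, base):
    """Number of item pairs ordered differently in ranking and base."""
    pos = {item: k for k, item in enumerate(base)}
    seq = [pos[item] for item in ranking]
    return sum(
        1
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] > seq[b]
    )


def sample_mallows(base, phi, rng):
    """Draw one ranking via repeated insertion; phi in [0, 1] is the dispersion."""
    ranking = []
    for i, item in enumerate(base):  # i items are already placed
        # Inserting at index j puts the new item above i - j earlier items,
        # creating i - j inversions, so that slot gets weight phi ** (i - j).
        weights = [phi ** (i - j) for j in range(i + 1)]
        j = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(j, item)
    return ranking


base = ["a", "b", "c", "d"]
noisy = sample_mallows(base, 0.5, random.Random(1))
exact = sample_mallows(base, 0.0, random.Random(0))  # phi = 0 returns base
```

Smaller phi concentrates samples near the base ranking, and phi = 1 gives a uniformly random permutation.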
Although well understood in the setting of a homogeneous population (a
single base ranking), the case of a heterogeneous population (mixture
of multiple base rankings) has so far resisted algorithms with
guarantees on worst case instances. In this talk I will present the
first polynomial time algorithm which provably learns the parameters
and the unknown base rankings of a mixture of two Mallows models. A
key component of our algorithm is a novel use of tensor decomposition
techniques to learn the top-k prefix of both rankings. Before this
work, even the question of identifiability in the case of a mixture of
two Mallows models was unresolved.
Joint work with Avrim Blum, Or Sheffet and Aravindan Vijayaraghavan.