Wednesday, January 31, 2018 - 10:30

Speaker: Mohit Bansal, UNC Chapel Hill
Location: CSE 305

Abstract: In this talk, I will discuss my group's recent work on using logically-directed textual entailment knowledge to improve a variety of downstream natural language generation tasks such as video captioning, document summarization, and sentence simplification. First, we employ a many-to-many multi-task learning setup to combine a directed premise-to-entailment generation task (as well as a video-to-video completion task) with the given downstream generation task of multimodal video captioning (where the caption is entailed by the video), achieving significant improvements over the state of the art on multiple datasets and metrics. Next, we employ multiple novel multi-task learning setups to achieve state-of-the-art results on the tasks of automatic document summarization and sentence simplification. Third, we optimize for entailment classification scores as sentence-level metric rewards in a reinforcement learning setup (via annealed policy gradient methods). Our novel multi-reward functions correct the standard phrase-matching metric rewards to allow only logically implied partial matches and to avoid contradictions, substantially improving conditioned generation results for both video captioning and document summarization. Finally, I will discuss some of our group's other recent work on image-, video-, and action-based language generation and interaction.
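The entailment-corrected reward idea above can be sketched roughly as follows. This is a toy illustration, not the speakers' actual system: `phrase_match_score` stands in for a phrase-matching metric such as ROUGE or CIDEr, `entailment_prob` for a trained neural NLI classifier, and the mixing weight is hypothetical.

```python
def phrase_match_score(generated: str, reference: str) -> float:
    """Toy stand-in for a phrase-matching metric: unigram overlap."""
    gen, ref = set(generated.lower().split()), set(reference.lower().split())
    return len(gen & ref) / max(len(gen), 1)

def entailment_prob(premise: str, hypothesis: str) -> float:
    """Placeholder for a trained entailment classifier's P(entailed);
    a real system would use a neural NLI model's softmax output."""
    hyp, prem = set(hypothesis.lower().split()), set(premise.lower().split())
    return 1.0 if hyp <= prem else 0.3

def multi_reward(generated: str, reference: str, weight: float = 0.5) -> float:
    """Mix the metric reward with the entailment score so that partial
    matches are credited only when they are logically implied."""
    match = phrase_match_score(generated, reference)
    entail = entailment_prob(premise=reference, hypothesis=generated)
    return (1 - weight) * match + weight * (match * entail)
```

Multiplying the match score by the entailment probability (rather than adding it) is what lets the reward discount phrase overlaps that contradict or are not implied by the reference.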


Bio: Dr. Mohit Bansal is an assistant professor in the Computer Science department at the University of North Carolina (UNC) Chapel Hill. Prior to this, he was a research assistant professor (3-year endowed position) at TTI-Chicago. He received his PhD from UC Berkeley in 2013 (where he was advised by Dan Klein) and his BTech from IIT Kanpur in 2008. His research interests are in statistical natural language processing and machine learning, with a particular interest in multimodal, grounded, and embodied semantics (i.e., language with vision and speech, for robotics), human-like language generation and Q&A/dialogue, and interpretable and structured deep learning. He is a recipient of the 2017 DARPA Young Faculty Award, 2017 ACL Outstanding Paper Award, 2014 ACL Best Paper Award Honorable Mention, 2016 and 2014 Google Faculty Research Awards, 2016 Bloomberg Data Science Award, 2017 Facebook ParlAI Award, and 2014 IBM Faculty Award. Webpage:

Friday, January 19, 2018 - 10:30

Speaker: Alexander M. Rush, Harvard University
Location: CSE 305

Title: Text Generation in an E2E World

Abstract: Progress in neural machine translation has led to optimism for text generation tasks such as summarization and dialogue, but it has been more difficult to quantify the successes and challenges in this space. In this talk, I will survey some of the recent advances in neural text generation, and present a successful implementation of these techniques for the 2017 E2E NLG challenge (Gehrmann et al., 2018). Despite success on these small-scale examples, however, similar models fail to scale to a more realistic data-to-document corpus. Analysis shows systems will need further improvements in discourse modeling, reference, and referring expression generation (Wiseman et al., 2017). Finally, recent research has considered the unsupervised NLG problem in the form of neural style transfer. I will end by showing promising results in this task using a continuous GAN-based autoencoder (Zhao et al., 2017).

Bio: Alexander "Sasha" Rush is an assistant professor at Harvard University. His research interest is in ML methods for NLP with recent focus on deep learning for text generation including applications in machine translation, data and document summarization, and diagram-to-text generation, as well as the development of the OpenNMT translation system. His past work focused on structured prediction and combinatorial optimization for NLP. Sasha received his PhD from MIT supervised by Michael Collins and was a postdoc at Facebook NY under Yann LeCun. His work has received four research awards at major NLP conferences.

Monday, January 8, 2018 - 10:30

Speaker: Chris Dyer, Deepmind
Location: CSE 305
Title: Recurrent Neural Networks and Bias in Learning Natural Languages
Abstract: As universal function approximators, recurrent neural networks can represent any distribution over sequences, given sufficient capacity. Although empirically impressive at learning distributions over natural language sentences, they have a bias toward representing sentences in terms of sequential recency, a poor match for the structural dependencies characteristic of natural language. I introduce recurrent neural network grammars (RNNGs), which add a latent structural component to sequential RNNs. RNNGs have a bias better suited to natural language, and results show that they are both excellent language models and strong predictors of syntactic structure.
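To make the latent structural component concrete: an RNNG jointly generates a sentence and its parse by interleaving nonterminal-opening, word-generation, and reduce actions. The toy helper below (an illustration, not from the talk) lists the generative action sequence corresponding to a given bracketed parse.

```python
def rnng_actions(bracketed: str) -> list[str]:
    """Map a bracketed parse like '(S (NP the dog) (VP barks))' to the
    RNNG generative actions NT(label), GEN(word), and REDUCE."""
    actions = []
    open_next = False
    for tok in bracketed.replace("(", " ( ").replace(")", " ) ").split():
        if tok == "(":
            open_next = True   # the next token is a nonterminal label
        elif tok == ")":
            actions.append("REDUCE")
        elif open_next:
            actions.append(f"NT({tok})")
            open_next = False
        else:
            actions.append(f"GEN({tok})")
    return actions
```

In the model itself these actions are predicted step by step by an RNN over a stack of partial constituents; the structural bias comes from composing each completed constituent into a single vector at REDUCE time.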
Bio: Chris Dyer is a staff scientist at DeepMind and a consulting faculty member in the School of Computer Science at Carnegie Mellon University. In 2017, he received the Presidential Early Career Award for Scientists and Engineers (PECASE). His work has occasionally been nominated for best paper awards in prestigious NLP venues and has, much more occasionally, won them. He lives in London and, in his spare time, plays cello.

Monday, December 4, 2017 - 10:30

Speaker: Scott Yih, AI2
Location: CSE 305

Title: Structured Prediction for Semantic Parsing

Abstract: Mapping unstructured text to structured meaning representations, semantic parsing covers a wide variety of problems in the domain of natural language understanding and interaction. Common applications include translating human commands to executable programs, as well as question answering when using databases or semi-structured tables as the information source. Settings of real-world semantic parsing problems, such as the large space of legitimate semantic parses, weak or mixed supervision signals, and complex semantic/syntactic constraints, pose interesting yet difficult structured prediction challenges. In this talk, I will give an overview of these technical challenges and present a case study on sequential question answering: answering sequences of simple but inter-related questions using semi-structured tables from Wikipedia. In particular, I will describe our dynamic neural semantic parsing framework trained using a weakly supervised reward-guided search, which effectively leverages the sequential context to outperform state-of-the-art QA systems that are designed to answer highly complex questions. I will conclude with a discussion of the open problems and promising directions for future research.
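The reward-guided search setting can be sketched schematically as follows. This is a generic illustration, not the speaker's actual framework: candidate parses are built up action by action under a model score, and the only supervision is a weak reward for whether executing a finished parse yields the gold answer. All names are illustrative.

```python
def beam_search(initial_state, expand, score, is_complete, beam_size=5):
    """Generic beam search over partial semantic parses."""
    beam = [initial_state]
    complete = []
    while beam:
        candidates = [s for state in beam for s in expand(state)]
        complete += [s for s in candidates if is_complete(s)]
        incomplete = [s for s in candidates if not is_complete(s)]
        # Keep only the top-scoring partial parses for further expansion.
        beam = sorted(incomplete, key=score, reverse=True)[:beam_size]
    return sorted(complete, key=score, reverse=True)

def reward(parse, execute, gold_answer):
    """Weak supervision: 1 if the parse's denotation matches the answer,
    with no gold logical form ever observed."""
    return 1.0 if execute(parse) == gold_answer else 0.0
```

In training, parses found by the search that earn reward serve as (noisy) positive signal for updating the model score, which is what makes the search "reward-guided" rather than supervised by annotated logical forms.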

Bio: Scott Wen-tau Yih is a Principal Research Scientist at the Allen Institute for Artificial Intelligence (AI2). His research interests include natural language processing, machine learning, and information retrieval. Yih received his Ph.D. in computer science from the University of Illinois at Urbana-Champaign. His work on joint inference using integer linear programming (ILP) has been widely adopted in the NLP community for numerous structured prediction problems. Prior to joining AI2, Yih spent 12 years at Microsoft Research, working on a variety of projects including email spam filtering, keyword extraction, and search & ad relevance. His recent work focuses on continuous representations and neural network models, with applications in knowledge base embedding, semantic parsing, and question answering. Yih received the best paper award from CoNLL-2011 and an outstanding paper award from ACL-2015, and has served as area co-chair (HLT-NAACL-12, ACL-14, EMNLP-16,17), program co-chair (CEAS-09, CoNLL-14), and action/associate editor (TACL, JAIR) in recent years. He has also co-presented several tutorials on topics including Semantic Role Labeling (NAACL-HLT-06, AAAI-07), Deep Learning for NLP (SLT-14, NAACL-HLT-15, IJCAI-16), and NLP for Precision Medicine (ACL-17).

Monday, November 20, 2017 - 10:30

Speaker: Danqi Chen, Stanford University
Location: CSE 305
Abstract: Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved, goal of NLP. This task of reading comprehension (i.e., question answering over a passage of text) has received a resurgence of interest, due to the creation of large-scale datasets and well-designed neural network models.
I will talk about how we build simple yet effective models for advancing a machine’s ability at reading comprehension. I’ll focus on explaining the logical structure behind these neural architectures and discussing the capacities of these models as well as their limits. Next I’ll talk about how we combine state-of-the-art reading comprehension systems with traditional IR modules to build a new generation of open-domain question answering systems. Our system is much simpler than traditional QA systems, answers questions efficiently over the full English Wikipedia, and shows great promise on multiple QA benchmarks. I’ll conclude with the main challenges and directions for future research.
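The retriever-plus-reader pipeline described above can be sketched minimally as follows. This is a toy illustration, not the actual system: the TF-IDF retriever is heavily simplified, and the `answer` reader is a placeholder where a real system would run a trained neural reading-comprehension model to extract an answer span.

```python
import math
from collections import Counter

def tfidf_retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by a simple TF-IDF similarity to the question."""
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(set(d.lower().split()))  # document frequency per term
    def score(doc: str) -> float:
        tf = Counter(doc.lower().split())
        return sum(tf[w] * math.log(n / df[w])
                   for w in question.lower().split() if w in tf)
    return sorted(docs, key=score, reverse=True)[:k]

def answer(question: str, docs: list[str]) -> str:
    passages = tfidf_retrieve(question, docs)
    # Placeholder reader: return the top passage; a real reader would
    # predict the start/end of an answer span within it.
    return passages[0]
```

Decoupling retrieval from reading is what lets such a system scale to the full English Wikipedia: the cheap retriever narrows millions of articles to a handful of passages before the expensive reader runs.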
Bio: Danqi Chen is a Ph.D. candidate in Computer Science at Stanford University, advised by Christopher Manning. She works on deep learning for natural language processing, and is particularly interested in the intersection between text understanding and knowledge representation/reasoning. Her research spans from machine comprehension/question answering to knowledge base construction and syntactic parsing, with an emphasis on building principled yet highly effective models. She is a recipient of a Facebook Fellowship, a Microsoft Research Women’s Fellowship and outstanding paper awards at ACL'16 and EMNLP’17. Previously, she received her B.S. with honors from Tsinghua University in 2012.

Friday, November 3, 2017 - 10:30

Speaker: Paul Smolensky, Microsoft Research AI & Johns Hopkins Cognitive Science Dept.
Location: CSE 305

Abstract: I will summarize an approach to neural network design that enables symbolic structures to be encoded as distributed neural vectors and allows symbolic computations of interest to AI & computational linguistics to be carried out through massively parallel neural operations. The approach leads to new grammar formalisms that are rooted in neural computation; these have had significant impact within linguistic theory (especially phonology). I will present theoretical results as well as recent experimental results with deep learning applying the method to image caption generation and to NLP for question answering. In each case, the model learns aspects of syntax in the service of its NLP task.

Bio: Paul Smolensky is a Partner Researcher in the Deep Learning group of the Microsoft Research AI lab in Redmond, WA, as well as Krieger-Eisenhower Professor of Cognitive Science at Johns Hopkins University in Baltimore, MD. His research addresses the unification of symbolic and neural computation with a focus on the theory of grammar. This work led to Optimality Theory, which he co-created with Alan Prince (1993); this is an outgrowth of Harmonic Grammar, which he co-developed with Géraldine Legendre & Yoshiro Miyata (1990). He received the 2005 D. E. Rumelhart Prize for Outstanding Contributions to the Formal Analysis of Human Cognition.

Friday, September 29, 2017 - 10:30

Speaker: Margaret Mitchell, Google Seattle
Location: CSE 305
Abstract: Beginning with the philosophical, psychological, and cognitive underpinnings of referring expression generation, and ending with theoretical, algorithmic, and applied contributions in mainstream vision-to-language research, I will discuss some of my work through the years towards the ultimate goal of helping humans and computers to communicate. This will be a multi-modal, multi-disciplinary talk (with pictures!), intended to be interesting no matter what your background is.

Bio: Margaret Mitchell is a Senior Research Scientist in Google's Research & Machine Intelligence group, working on artificial intelligence. Her research generally involves vision-language and grounded language generation, focusing on how to evolve artificial intelligence towards positive goals. This includes research on helping computers to communicate based on what they can process, as well as projects to create assistive and clinical technology from the state of the art in AI.

Monday, April 3, 2017 - 10:00

Speaker: Doug Oard (University of Maryland)
Location: EEB 303
Abstract: The usual setup for Knowledge Base Population (KBP) is that we are initially given some collection of documents and (optionally) some incomplete knowledge base, and we are asked to produce a more complete knowledge base in which the set of entities, the attributes of those entities, and the relationships between those entities have been enriched based on information attested in the document collection. In this talk, we will describe our work on two prerequisites for KBP over content produced informally through conversation, as for example often happens in speech or email. One thing we would like to know, if we are to update the right entity representations, is who or what is being mentioned; this is an entity linking task. We’ll therefore begin our talk by describing our experiments with KBP from email, covering our work both with the well-known Enron email collection and with the more recently created Avocado email collection that is now available for research use from LDC. One result of that research is that social features, such as who is mentioning someone to whom, are particularly useful. For that reason, we will next step back a level and consider the speaker identification problem in conversational telephone speech. Working with a set of more than one thousand recorded phone calls made or received by Enron energy traders, we have explored the use of two types of information from our Enron email knowledge base, along with some additional social and channel features, to improve over the speaker recognition accuracy that we could achieve using only acoustic features. We’ll conclude the talk with a look ahead at what needs to be done to bring these and other components together to actually populate knowledge bases from conversational sources. This is joint work with Tamer Elsayed, Mark Dredze and Greg Sell.
Bio: Douglas Oard is a Professor at the University of Maryland, College Park, with joint appointments in the College of Information Studies and the University of Maryland Institute for Advanced Computer Studies (UMIACS). Dr. Oard earned his Ph.D. in Electrical Engineering from the University of Maryland. His research interests center around the use of emerging technologies to support information seeking by end users. Additional information is available at

Monday, February 27, 2017 - 10:00

Speaker: Yulia Tsvetkov (Stanford)
Location: CSE 305
Abstract: One way to build more powerful, robust models and to provide deeper insight into data is to build hybrid models, integrating linguistic/social signals into statistical learning. I'll present model-based approaches that incorporate linguistic and social diversity into deep learning models to make them more robust and less biased toward particular languages, varieties, or demographics. First, I'll describe polyglot language models: recurrent neural networks that use shared annotations and representations in a single computational analyzer to process multiple languages, for the benefit of all, in particular those with insufficient training resources. Then, I’ll present an approach to integrating linguistic diversity into the training data of non-convex models. The method optimizes the linguistic content and structure of available training data to find a better curriculum for learning distributed representations of words. I’ll conclude with an overview of my current research, which focuses on socially equitable NLP models: adversarial models that incorporate social diversity in the training objective to eliminate social biases hidden in data.

Bio: Yulia Tsvetkov is a postdoc in the Stanford NLP Group, where she works with Professor Dan Jurafsky on NLP for social good. During her PhD at the Language Technologies Institute at Carnegie Mellon University, she worked on advancing machine learning techniques to tackle cross-lingual and cross-domain problems in natural language processing, focusing on computational phonology and morphology, distributional and lexical semantics, and statistical machine translation of both text and speech. In 2017, Yulia will join the Language Technologies Institute at CMU as an assistant professor.

Monday, February 13, 2017 - 10:00

Speaker: Julia Hockenmaier (UIUC)
Location: CSE 305
Abstract: The task of automatic image captioning (i.e., associating images with sentences that describe what is depicted in them) has received a lot of attention over the last couple of years. But how well do these systems actually work? What are the limits of current approaches? In this talk, I will attempt to give an overview of how work in this area has developed. I will also highlight some shortcomings of current approaches, and discuss future directions.

Bio: Julia Hockenmaier is an associate professor in Computer Science at the University of Illinois at Urbana-Champaign. She works on natural language processing. Her current research focuses on automatic image description, statistical parsing, and unsupervised grammar induction. Her group produced the popular Flickr30K dataset. She has given tutorials on image description at EACL and CVPR. Julia received her PhD from the University of Edinburgh and did postdoctoral work at the University of Pennsylvania. She has received an NSF CAREER award and was shortlisted for the British Computer Society’s Distinguished Dissertation award.