Talks | Paul G. Allen School of Computer Science & Engineering

Tuesday, October 1, 2019 - 10:30

Teaching Machines with Humans in the Loop

Speaker: Sanja Fidler, University of Toronto / NVIDIA

Location: CSE 305

Abstract: The most natural way for an artificial agent to communicate with a human is through language. Language allows an agent to convey what it is seeing and what its internal goals are, to ask questions about concepts it is uncertain about, and possibly to engage in a conversation. Similarly, a human can use language to teach an agent new concepts, describe instructions for tasks that the agent should perform, and possibly give feedback to the agent by describing its mistakes. In this talk, I will describe our recent work in this domain.

Bio: Sanja Fidler is an Assistant Professor at the Department of Computer Science, University of Toronto. She joined UofT in 2014. In 2018, she took on the role of Director of AI at NVIDIA, leading a research lab in Toronto. Previously she was a Research Assistant Professor at TTI-Chicago, a philanthropically endowed academic institute located on the campus of the University of Chicago. She completed her PhD in computer science at the University of Ljubljana in 2010, and was a postdoctoral fellow at the University of Toronto during 2011-2012. In 2010 she visited UC Berkeley as a visiting research scientist. She has served as a Program Chair of the 3DV conference, and as an Area Chair of CVPR, ICCV, EMNLP, ICLR, NIPS, and AAAI, and will serve as Program Chair of ICCV'21. She received the NVIDIA Pioneer of AI award, Amazon Academic Research Award, Facebook Faculty Award, and the Connaught New Researcher Award. In 2018 she was appointed as the Canadian CIFAR AI Chair. She has also been ranked among the top 3 most influential female AI researchers in Canada by Re-WORK. Her work on semi-automatic object instance annotation won the Best Paper Honorable Mention at CVPR’17. Her main research interests are scene parsing from images and videos, interactive annotation, 3D scene understanding, 3D content creation, and multimodal representations.

Wednesday, August 7, 2019 - 10:30

TBD (regarding Open-Domain QA)

Speaker: Peng Qi

Location: Allen 305

Abstract: TBD.

Bio: Peng Qi is a PhD student in Computer Science at Stanford University. His research interests revolve around building natural language processing systems that better bridge between humans and the large amount of (textual) information we are engulfed in. Specifically, he is interested in building knowledge representations, (open-domain) question answering, explainability, and multi-lingual NLP. He is also interested in linguistics, and builds tools for linguistic structure analysis applicable to many languages.

Friday, June 28, 2019 - 15:30

Towards Precise and Robust Text Generation

Speaker: Ankur Parikh

Location: Allen 305

Abstract: Despite large advances in neural text generation in terms of fluency, existing generation techniques are prone to hallucination and often produce output that is factually incorrect or structurally incoherent. In this talk, we study this problem from the perspectives of evaluation, modeling, and robustness. We first discuss how existing evaluation metrics like BLEU or ROUGE show poor correlation with human judgement when the reference text diverges from information in the source, a common phenomenon in generation datasets. We propose a new metric, PARENT, which aligns n-grams to the source before computing precision and recall, making it considerably more robust to divergence. Next, we discuss modeling, proposing an exemplar-based approach to conditional text generation that aims to leverage training instances to build instance-specific decoders that can more easily capture style and structure. Results on 3 datasets show that our model achieves strong performance and outperforms comparable baselines. Lastly, we discuss generalization of neural generation in non-iid settings, focusing on the problem of zero-shot translation, a challenging setup that tests models on translation directions they have not been optimized for at training time. We define the notion of zero-shot consistency and introduce a consistent agreement-based training method that results in a zero-shot improvement of 2-3 BLEU points over strong baselines.

Collaborators: This is joint work with Maruan Al-Shedivat, Bhuwan Dhingra, Hao Peng, Manaal Faruqui, Ming-Wei Chang, William Cohen, and Dipanjan Das.

Bio: Ankur Parikh is a Senior Research Scientist at Google NYC and an adjunct assistant professor at NYU. His primary interests are in natural language processing and machine learning. Ankur received his PhD from Carnegie Mellon in 2015 and his B.S.E. from Princeton University in 2009. He has received a best paper runner-up award at EMNLP 2014 and a best paper award in translational bioinformatics at ISMB 2011.

Thursday, December 13, 2018 - 10:30

Learning, Representing, and Understanding Language

Speaker: Aida Nematzadeh

Location: CSE 305

Abstract: Language is one of the greatest puzzles of both human and artificial intelligence (AI). Children learn and understand their language effortlessly; yet, we do not fully understand how they do so. Moreover, although access to more data and computation has resulted in recent advances in AI systems, they are still far from human performance in many language tasks. In my research, I try to address two broad questions: how do humans learn, represent, and understand language? And how can this inform AI? In the first part of my talk, I show how computational modeling can help us understand the mechanisms underlying child word learning. I introduce an unsupervised model that learns word meanings using general cognitive mechanisms; this model processes data that approximates child input and assumes no built-in linguistic knowledge. Next, I explain how cognitive science of language can help us examine current AI models and develop improved ones. In particular, I focus on how investigating human semantic processing helps us model semantic representations more accurately. Finally, I explain how we can use experiments in theory-of-mind to examine question-answering models with respect to reasoning capacity about beliefs.

Bio: Aida Nematzadeh is a research scientist at DeepMind. Previously she was a postdoctoral researcher at UC Berkeley affiliated with the Computational Cognitive Science Lab and BAIR. She received a PhD and an MSc in Computer Science from the University of Toronto. Her research interests lie at the intersection of cognitive science, computational linguistics, and machine learning.

Tuesday, November 27, 2018 - 10:30

Expectation-based syntactic processing in humans and machines

Speaker: Roger Levy

Location: CSE 305

Abstract:
Psycholinguistics and computational linguistics are the two fields most dedicated to accounting for the computational operations required to understand natural language. Today, both fields find themselves responsible for understanding the behaviors and inductive biases of "black-box" systems: the human mind and artificial neural network (ANN) models, respectively. In this talk I highlight how the two fields can productively contribute to one another, with a focus on the study of syntactic processing. I first describe the surprisal theory of interpretation, prediction, and differential processing difficulty in human language comprehension. I then show how surprisal theory and controlled experimental paradigms from psycholinguistics can help us probe ANN language model behavior for evidence of human-like grammatical generalizations. We find that ANNs exhibit a range of subtle behaviors, including embedding-depth tracking and garden-pathing over long stretches of text, that suggest representations homologous to incremental syntactic state in human language processing. These ANNs also learn abstract word-order preferences and many generalizations about the long-distance filler-gap dependencies that are a hallmark of natural language syntax, perhaps most surprisingly including many filler-gap "island" constraints. However, even when trained on a human lifetime's worth of linguistic input these ANNs fail to learn a number of key basic facts about other core grammatical dependencies. Finally, I comment on the respects in which the departures of recurrent neural network language models from the predictions of the "competence" grammars developed in generative linguistics might provide a "performance" account of human language processing -- and on the respects in which they might not.

Bio:
Roger Levy is a professor in the Department of Brain and Cognitive Sciences at the Massachusetts Institute of Technology. He directs MIT's Computational Psycholinguistics Laboratory. Before joining MIT, he was on the faculty of the Department of Linguistics at UC San Diego. He received his PhD in Linguistics from Stanford University. His research focuses on theoretical and applied questions in the processing and acquisition of natural language.

Tuesday, November 20, 2018 - 10:30

Maximum Mutual Information Predictive Coding --- a Path to Semantics?

Speaker: David McAllester

Location: CSE 305

Abstract: Language modeling, in the form of ELMo and BERT, yields useful pre-trained task-tunable models. This success of unsupervised pre-training distinguishes language, with its discrete signals, from continuous-signal domains such as speech and vision. Until now the most effective pre-trained task-tunable models in continuous-signal domains have been generated from human-annotated data. This talk will discuss maximum mutual information (MMI) predictive coding as a unifying framework for unsupervised training in both continuous and discrete domains. It will describe the motivation, the mathematics, and some very promising computer vision results out of Montreal. It will also present a speculative discussion of MMI predictive coding as a language modeling approach to natural language semantics.

Bio: Professor McAllester received his B.S., M.S., and Ph.D. degrees from the Massachusetts Institute of Technology in 1978, 1979, and 1987 respectively. He served on the faculty of Cornell University for the academic year of 1987-1988 and served on the faculty of MIT from 1988 to 1995. He was a member of technical staff at AT&T Labs-Research from 1995 to 2002. He has been a fellow of the Association for the Advancement of Artificial Intelligence (AAAI) since 1997. From 2002 to 2017 he was Chief Academic Officer at the Toyota Technological Institute at Chicago (TTIC), where he is currently a Professor. He has received two 20-year "test of time" awards: one for a paper on systematic nonlinear planning at the AAAI conference and one for a paper on interval methods for constraint solving at the International Conference on Logic Programming.

Tuesday, November 13, 2018 - 10:30

Segmentation-based Sequence Modeling and Its Applications

Speaker: Chong Wang

Location: CSE 305

Abstract: In this talk, I will present a general framework for sequence modeling that exploits the segmental structure of sequences. We first observe that segmental structure is a common pattern in many types of sequences, e.g., phrases in human languages. We then design a probabilistic model that is able to consider all valid segmentations of a sequence. We describe an efficient and exact dynamic programming algorithm for forward and backward computations. Because of its generality, the model can be used as a loss function in many sequence tasks. We demonstrate our approach on text segmentation, speech recognition, machine translation, and dialog policy learning. In addition to quantitative results, we also show that our approach can discover meaningful segments in their respective application contexts. (This is joint work with many of my previous and current collaborators.)

Bio: Chong Wang is a research scientist at Google. Before Google, he worked at Microsoft Research and Baidu Silicon Valley AI Lab. He received his PhD from Princeton University. His research interests include machine learning and its applications to speech, translation, and natural language understanding. He has won several best paper awards at top machine learning conferences, and some of his work has gone into widely used products serving users around the globe. His homepage is https://chongw.github.io

Tuesday, October 16, 2018 - 10:30

Rebooting AI

Speaker: Gary Marcus

Location: Gates Commons

Abstract:
A review of recent advances in AI, and why, despite genuine progress, we may not be on the right track towards general AI, followed by some very tentative discussion about what might be better.

Bio:
Gary Marcus, scientist, bestselling author, and entrepreneur, was CEO and Founder of the machine learning startup Geometric Intelligence, recently acquired by Uber.

As a Professor of Psychology and Neural Science at NYU, he has published extensively in fields ranging from human and animal behavior to neuroscience, genetics, and artificial intelligence, often in leading journals such as Science and Nature.

Tuesday, October 9, 2018 - 10:30

Universal Information Extraction

Speaker: Heng Ji

Location: CSE 305

Abstract:

The big data boom in recent years covers a wide spectrum of heterogeneous data types, from text to image, video, speech, and multimedia. Most of the valuable information in such "big data" is encoded in natural language, which makes it accessible to some people—for example, those who can read that particular language—but much less amenable to computer processing beyond a simple keyword search.

My research focus, cross-source Information Extraction (IE) on a massive scale, aims to create the next generation of information access in which humans can communicate with computers in any natural language beyond keyword search, and computers can discover accurate, concise, and trustworthy information embedded in big data from heterogeneous sources.

The goal of Information Extraction (IE) is to extract structured facts from a wide spectrum of heterogeneous unstructured data types. Traditional IE techniques are limited to a certain source X (X = a particular language, domain, limited number of pre-defined fact types, single data modality, ...). When moving from X to a new source Y, we need to start from scratch again by annotating a substantial amount of training data and developing Y-specific extraction capabilities.

In this talk, I will present a new Universal IE paradigm that combines the merits of traditional IE (high quality and fine granularity) and Open IE (high scalability). This framework is able to discover schemas and extract facts from any input data in any domain, without any annotated training data, by integrating distributional semantics and symbolic semantics. It can also be extended to thousands of languages, thousands of fact types, and multiple data modalities (text, images, videos) by constructing a multi-lingual multi-media multi-task common semantic space and then performing zero-shot transfer learning across sources. The resulting system was selected for DARPA's 60th Anniversary.

Bio:

Heng Ji is the Edward P. Hamilton Chair Professor in Computer Science at Rensselaer Polytechnic Institute. She received her Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially on Information Extraction and Knowledge Base Population. She was selected as "Young Scientist" and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016, 2017 and 2018. She received the "AI's 10 to Watch" Award from IEEE Intelligent Systems in 2013, an NSF CAREER award in 2009, Google Research Awards in 2009 and 2014, the IBM Watson Faculty Award in 2012 and 2014, and Bosch Research Awards in 2015, 2016 and 2017. She has coordinated the NIST TAC Knowledge Base Population task since 2010, and has led various government-sponsored research projects, including the DARPA DEFT TinkerBell team and the ARL NS-CTA Knowledge Networks Construction task. She has served as a panelist for US Air Force 2030, and as the Program Committee Co-Chair of several conferences including NAACL-HLT 2018.

Saturday, July 21, 2018 - 06:40

Neural and symbolic semantic parsing with typed graph algebras

Speaker: Alexander Koller

Location: CSE 691 (Gates Commons)

Abstract: Much recent research on semantic parsing has focused on learning to map natural-language sentences to graphs which represent the meaning of the sentence, such as Abstract Meaning Representations (AMRs) and MRS graphs. In this talk, I will discuss methods for semantic parsing into graphs which aim to make the compositional structure of the semantic representations explicit. This connects semantic parsing to a fundamental principle of linguistic semantics and should improve generalization to unseen data, and thereby accuracy.

I will first introduce two graph algebras, the HR algebra from the theory literature and our own apply-modify (AM) algebra, and show how to define symbolic grammars that map between strings and graphs using these algebras. Compared to the HR algebra, the AM algebra drastically reduces the number of possible compositional structures for a given graph, but it still permits linguistically plausible analyses for a variety of nontrivial semantic phenomena.

I will then report on a neural semantic parser which learns to map sentences into terms over the AM algebra. This semantic parser combines a neural supertagger (which predicts elementary graphs for each word in the sentence) with a neural dependency parser (which predicts the structure of the AM terms). By constraining the search to AM terms which also satisfy certain simple type constraints, we achieve state-of-the-art (pre-ACL) accuracy in AMR parsing. One advantage of the model is that it generalizes neatly to other semantic parsing problems, such as semantic parsing into MRS or DRT.

Bio:

Alexander Koller is a Professor of Computational Linguistics in the Department of Language Science and Technology at Saarland University. He also holds a joint appointment with Facebook AI Research.