The artificial intelligence and machine learning communities have made tremendous strides in the last decade. Yet the best systems to date still struggle with routine tests of human intelligence, such as standardized science exams posed as-is in natural language, even at the elementary-school level. Can we demonstrate human-like intelligence by building systems that can pass such tests? Unlike typical factoid-style question answering (QA) tasks, these tests challenge a student's ability to combine multiple facts in various ways, and they appeal to broad common-sense and science knowledge. Going beyond arguably shallow information retrieval (IR) and statistical correlation techniques, we view science QA through the lens of combinatorial optimization over a semi-formal knowledge base derived from text. Our structured inference system, formulated as an Integer Linear Program (ILP), turns out to be not only highly complementary to IR methods, but also more robust to question perturbation, as well as substantially more scalable and accurate than prior attempts using probabilistic first-order logic and Markov Logic Networks (MLNs). This talk will discuss fundamental challenges behind the science QA task, the progress we have made, and the many challenges that lie ahead.
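To give a flavor of the optimization view described above, here is a toy sketch in Python. This is not the actual system discussed in the talk; it is an illustrative 0-1 program in which binary variables select supporting facts and an answer option, the objective rewards lexical overlap between question, facts, and answer, and the tiny program is solved by exhaustive enumeration rather than a real ILP solver. The question, facts, weights, and constraint are all made up for illustration.

```python
# Illustrative sketch only (not the system from the talk): answer selection
# as a 0-1 integer program, solved here by brute-force enumeration.
from itertools import product

question = {"what", "form", "of", "energy", "does", "a", "stove", "produce"}
options = {"A": {"light"}, "B": {"heat"}, "C": {"sound"}}
facts = [
    {"a", "stove", "is", "a", "source", "of", "heat"},   # fact 0
    {"a", "bell", "is", "a", "source", "of", "sound"},   # fact 1
]

def score(chosen_facts, option_words):
    """Reward chosen facts that overlap both the question and the answer."""
    total = 0
    for f in chosen_facts:
        total += len(f & question)          # fact-question alignment
        total += 2 * len(f & option_words)  # fact-answer alignment (weighted)
    return total

best = None
for picks in product([0, 1], repeat=len(facts)):  # binary fact variables
    if sum(picks) > 1:                            # toy constraint: use <= 1 fact
        continue
    chosen = [f for f, p in zip(facts, picks) if p]
    for label, words in options.items():          # pick exactly one option
        s = score(chosen, words)
        if best is None or s > best[0]:
            best = (s, label)

print(best[1])  # prints "B": the stove/heat fact aligns best
```

In a full-scale formulation, the same idea is expressed as linear constraints over thousands of variables and handed to an off-the-shelf ILP solver, rather than enumerated.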

As a research scientist at the Allen Institute for AI (AI2), Ashish Sabharwal investigates scalable and robust methods for probabilistic and combinatorial inference, graphical models, and discrete optimization, especially as they apply to assessing machine intelligence through standardized exams in science and math. Prior to joining AI2, Ashish spent over three years at IBM Watson and five years at Cornell University, after obtaining his Ph.D. from the University of Washington in 2005. Ashish has co-authored over 70 publications, been part of winning teams in international reasoning competitions, and received five best paper awards and runner-up prizes at venues such as AAAI, IJCAI, and UAI.
Ashish Sabharwal (AI2)
Wednesday, March 2, 2016 - 16:30
EEB 045