Abstract: The GLUE and SuperGLUE shared-task benchmarks aim to measure progress toward the goal of building general-purpose pretrained neural network models for language understanding. This goal turns out to have been widely shared, and these benchmarks have become a significant target for research in the NLP and machine learning communities. In this talk, I'll review the motivations behind these benchmarks and what they can tell us about recent progress in NLP, and raise a few (open!) questions about how we should measure further progress in this area.
Bio: Sam Bowman has been an assistant professor at NYU since 2016, when he completed a PhD with Chris Manning and Chris Potts at Stanford. At NYU, Sam is jointly appointed between the new school-level Center for Data Science, which focuses on machine learning, and the Department of Linguistics. Sam's research focuses on data, evaluation techniques, and modeling techniques for sentence and paragraph understanding in natural language processing, and on applications of machine learning to scientific questions in linguistic syntax and semantics. Sam organized a twenty-three-person research team at JSALT 2018 and received a 2015 EMNLP Best Resource Paper Award, a 2017 Google Faculty Research Award, and a 2019 *SEM Best Paper Award.