Tuesday, November 14, 2023: 1:00 p.m.
Paul G. Allen Center for Computer Science & Engineering, Microsoft Atrium

Hanna Hajishirzi
Paul G. Allen School of Computer Science & Engineering


Open Language Model (OLMo):
The science of language models and language models for science

Abstract

Over the past few years, and especially since the deployment of ChatGPT in November 2022, neural language models with billions of parameters, trained on trillions of words, have come to power the fastest-growing computing applications in history and have generated discussion and debate across society. However, AI scientists cannot study or improve those state-of-the-art models because the models' parameters, training data, code, and even documentation are not openly available. In this talk, I present our OLMo project, which aims to build strong language models and make them fully open to researchers, along with open-source code for data management, training, inference, and interaction. In particular, I describe DOLMa, a 3T-token open dataset curated for training language models; Tulu, our instruction-tuned language model; and OLMo v1, a fully open 7B-parameter language model.

Bio

Hanna Hajishirzi is the Torode Family Associate Professor at UW CSE and a Senior Director of NLP at AI2. Her research spans different areas in NLP and AI, focusing on understanding, analyzing, and constructing language models. Her honors include a Sloan Fellowship, an NSF CAREER Award, an Allen Distinguished Investigator Award, an Intel Rising Star Award, a UIUC alumni award, a best paper award and an honorable mention paper award, and several industry research faculty awards. Hanna received her PhD from the University of Illinois at Urbana-Champaign and spent a year as a postdoc at Disney Research and CMU.

This talk is viewable on our YouTube channel.