Abstract: While deep neural network models have dramatically improved the quality of machine translation (MT), truly breaking language barriers requires not only translating accurately, but also comparing what is said and how it is said across languages. In this talk, I will argue that modeling divergences from common assumptions about the data used to model MT can not only improve MT, but also help broaden the framing of MT to make it more responsive to user needs. I will first discuss recent work on automatically detecting cross-lingual semantic divergences, which occur when translation does not preserve meaning [Vyas & Carpuat, EMNLP 2019]. Next, I will introduce a training objective for neural sequence-to-sequence models that accounts for divergences between MT model hypotheses and human reference translations [Xu, Niu & Carpuat, NAACL 2019]. Finally, I will argue that translation does not necessarily need to preserve all properties of the input and introduce a family of models that let us tailor translation style while preserving input meaning [Niu, Rao & Carpuat, COLING 2017; Agrawal & Carpuat, EMNLP 2019].

Bio: Marine Carpuat is an Assistant Professor in Computer Science at the University of Maryland. Her research focuses on multilingual natural language processing and machine translation. Before joining the faculty at Maryland, Marine was a Research Scientist at the National Research Council Canada. She received a PhD in Computer Science and an MPhil in Electrical Engineering from the Hong Kong University of Science & Technology, and a Diplome d'Ingenieur from the French Grande Ecole Supelec. Marine is the recipient of an NSF CAREER award, research awards from Google and Amazon, best paper awards at *SEM and TALN, and an Outstanding Teaching Award.

Marine Carpuat, University of Maryland
Thursday, October 31, 2019 - 10:30
CSE 305