Title: Self-improving Crowdsourcing

Advisors: Dan Weld and Mausam

Supervisory Committee: Dan Weld (co-Chair), Mausam (co-Chair), Andy Ko (GSR, iSchool), Dieter Fox, and Ece Kamar (MSR)

Abstract: Crowdsourcing enables data collection at scale, which is critical to accelerating scientific discovery and supporting a host of new machine learning applications. However, ensuring high-quality data remains a key challenge, and one of central importance to downstream applications. Creating a successful crowdsourcing task requires a significant investment of time and iteration to ensure that workers understand the task and produce quality data. This cost underlies nearly every reported crowdsourcing success, yet it is seldom acknowledged and therefore often underestimated. The initial investment makes crowdsourcing impractical for all but the largest tasks; in many cases, it may actually be less costly for the task designer to simply perform the task herself.

In this talk, I will present my vision for a crowdsourcing system that reduces the burden of creating new tasks by using information gathered from the crowd, either explicitly or implicitly, to improve tasks automatically. The system will seek to minimize the involvement of the task designer ("requester"), whose sole interaction with the system is to provide gold answers and explanations. It will use machine learning and decision theory, along with information from the crowd, to automatically create effective instruction for workers, thereby reducing the up-front cost of creating successful crowdsourcing tasks. This work extends our previous work on optimizing the amount of instruction by also considering the quality of instruction. Unlike unsuccessful attempts by other researchers to reduce requester involvement, this work will succeed by creating workflows that solve useful subgoals (e.g., finding a diverse and instructive set of questions) and by focusing initially on improving worker performance on consensus tasks, a large class of crowdsourcing problems.

Place: CSE 303

When: Thursday, August 11, 2016 - 10:00 to 11:30