Title: New Algorithmic Tools for Distributed Similarity Search

Advisor: Paul Beame

Supervisory Committee: Paul Beame (Chair), Jevin West (GSR, iSchool), Luis Ceze, Mark Oskin, and Magda Balazinska

Abstract: The task of finding similar vectors in a large dataset arises as a subroutine in many applcations, such as image seach and collaborative filtering.  I will describe some recently completed theoretical work that provides new upper and lower bounds on the communication required to report all close pairs in Hamming distance.  I will also propose two future projects, centering around the implementation of distributed algorithms for similarity search based on Hamming distance and Edit distance.  I will mention connections to edge-isoperimetry, distance correlations, DNA clustering, and metric embeddings.

Place: 
CSE 503
When: 
Wednesday, July 27, 2016 - 11:00 to 12:30