Michael J. Cafarella
Email: username mjc, found at cs dot washington dot edu.
Physical mail:
Mike Cafarella
University of Washington
Department of Computer Science and Engineering
Box 352350
Seattle, WA 98195-2350
Office: 482 Allen Center
I am a 5th-year graduate student in the Department of Computer Science and Engineering at the University of Washington. My research interests are databases, information retrieval and extraction, and machine learning/data mining. My recent work has focused on recovering and querying structured data found on the Web.
My advisors are Oren Etzioni and Dan Suciu. I've collaborated with many fellow students, most recently with Michele Banko, Chris Re, and Nodira Khoussainova. I have also completed two research projects at Google with Alon Halevy.
I've earned degrees from Brown and the University of Edinburgh, Scotland.
Before grad school, I worked at Marimba (later bought by BMC) and Tellme Networks (later bought by Microsoft). I also co-started the Nutch and Hadoop open source search projects with Doug Cutting (but the demands of grad school have finally pushed me to emeritus status).
Publications
2007
- Navigating Extracted Data with Schema Discovery. Michael J. Cafarella, Dan Suciu, Oren Etzioni. Proceedings of the Tenth International Workshop on the Web and Databases (WebDB), June 2007. Beijing, China.
- Structured Querying of Web Text: A Technical Challenge. Michael J. Cafarella, Christopher Re, Dan Suciu, Oren Etzioni, Michele Banko. Proceedings of the Conference on Innovative Data Systems Research (CIDR) 2007. Asilomar, CA.
- Open Information Extraction from the Web. Michele Banko, Michael J. Cafarella, Stephen Soderland, Matthew Broadhead, Oren Etzioni. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), January 2007. Hyderabad, India.
2006
2005
- KnowItNow: Fast, Scalable Information Extraction from the Web. Michael J. Cafarella, Doug Downey, Stephen Soderland, and Oren Etzioni. Proceedings of the Conference on Empirical Methods in Natural Language Processing. Vancouver, 2005.
- A Search Engine for Natural Language Applications. Michael J. Cafarella, Oren Etzioni. Proceedings of the 14th International World Wide Web Conference (WWW 2005).
- Unsupervised named-entity extraction from the Web: An experimental study. Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates. In Artificial Intelligence 165, pp. 91-134. 2005.
2004
- Methods for Domain-Independent Information Extraction from the Web: An Experimental Comparison. Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates. Proceedings of AAAI 2004.
- Web-scale Information Extraction in KnowItAll. Oren Etzioni, Michael Cafarella, Doug Downey, Stanley Kok, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates. Proceedings of the 13th International World Wide Web Conference (WWW 2004).
- Building Nutch: Open Source Search. Mike Cafarella, Doug Cutting. ACM Queue, 2(2), April 2004.
Teaching
I TA'ed CSE454, Advanced Internet and Web Services, in Winter '04 and Autumn '06. I really enjoyed helping to teach this class; if you're a UW student, give it a shot.
Personal
Last modified: April 17, 2008