Michael J. Cafarella
Michael Cafarella
Assistant Professor
Computer Science and Engineering
2260 Hayward Ave.
Ann Arbor, MI 48109
Office: 4709 CSE
Phone: 734-764-9418
I am a new professor in the Software Systems Lab in Computer Science and Engineering at the University of Michigan. I'm also a member of the Database Group. My research interests are in databases, information extraction, and data mining. I am particularly interested in extracting and managing Web data.
Note To Students: I'm interested in taking on new students, both graduate students and undergrads, to work on research projects. If you're at Michigan, send me an email and we'll chat! If you're not currently a student at Michigan but would still like to work together, please read this short document.
Teaching
I'm teaching EECS485 in the Winter of 2010. Hope to see you there.
Publications
2009
2008
- Ontology-driven, Unsupervised Instance Population. Luke K. McDowell and Michael Cafarella. Journal of Web Semantics 6(3): 218-236, 2008.
- Uncovering the Relational Web. Michael J. Cafarella, Alon Halevy, Yang Zhang, Daisy Zhe Wang, Eugene Wu. Proceedings of the Eleventh International Workshop on the Web and Databases (WebDB), June 2008. Vancouver, Canada.
- WebTables: Exploring the Power of Tables on the Web. Michael J. Cafarella, Alon Halevy, Yang Zhang, Daisy Zhe Wang, Eugene Wu. Proceedings of VLDB 2008, August 2008. Auckland, New Zealand.
- Data Management Projects at Google. Michael Cafarella, Edward Chang, Andrew Fikes, Alon Halevy, Wilson Hsieh, Alberto Lerner, Jayant Madhavan, S. Muthukrishnan. SIGMOD Record, 37(1), 2008.
- Web-Scale Extraction of Structured Data. Michael J. Cafarella, Jayant Madhavan, Alon Halevy. SIGMOD Record 37(4): 55-61, 2008.
2007
- Navigating Extracted Data with Schema Discovery. Michael J. Cafarella, Dan Suciu, Oren Etzioni. Proceedings of the Tenth International Workshop on the Web and Databases (WebDB), June 2007. Beijing, China.
- Structured Querying of Web Text: A Technical Challenge. Michael J. Cafarella, Christopher Re, Dan Suciu, Oren Etzioni, Michele Banko. Proceedings of the Conference on Innovative Data Systems Research (CIDR) 2007. Asilomar, CA.
- Open Information Extraction from the Web. Michele Banko, Michael J. Cafarella, Stephen Soderland, Matthew Broadhead, Oren Etzioni. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), January 2007. Hyderabad, India.
2006
2005
- KnowItNow: Fast, Scalable Information Extraction from the Web. Michael J. Cafarella, Doug Downey, Stephen Soderland, and Oren Etzioni. Proceedings of the Conference on Empirical Methods in Natural Language Processing. Vancouver, 2005.
- A Search Engine for Natural Language Applications. Michael J. Cafarella, Oren Etzioni. Proceedings of the 14th International World Wide Web Conference (WWW 2005).
- Unsupervised named-entity extraction from the Web: An experimental study. Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates. In Artificial Intelligence 165, pp. 91-134. 2005.
2004
- Methods for Domain-Independent Information Extraction from the Web: An Experimental Comparison. Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates. Proceedings of AAAI 2004.
- Web-scale Information Extraction in KnowItAll. Oren Etzioni, Michael Cafarella, Doug Downey, Stanley Kok, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates. Proceedings of the 13th International World Wide Web Conference (WWW 2004).
- Building Nutch: Open Source Search. Mike Cafarella and Doug Cutting. ACM Queue, 2(2), April 2004.
Short Bio
I earned my Ph.D. from the University of Washington in 2009, with advisors Oren Etzioni and Dan Suciu. I also worked with Alon Halevy at Google during an extended internship there. Before graduate school I worked for two startups in California: Marimba, which built software distribution infrastructure, and Tellme Networks, which built (and still builds) voice-recognition phone services. With Doug Cutting, I also co-started the Nutch and Hadoop open-source projects; I worked on them for many years but am no longer actively developing them.
Miscellaneous
2009