UW MSR Summer Institute 2010
Cloud Data Services: Challenges and Opportunities

Daniel Abadi is an Assistant Professor at Yale University, with research interests in database system architecture and implementation, cloud computing, and the Semantic Web. He received his PhD from MIT in 2008, focusing in his thesis on query execution in column-store database systems. Abadi has been a recipient of a Churchill Scholarship, an NSF CAREER Award, the 2008 SIGMOD Jim Gray Doctoral Dissertation Award, and the 2007 VLDB best paper award.

Molham Aref

Shivnath Babu is an Assistant Professor of Computer Science at Duke University. He got his Ph.D. from Stanford University in 2005. He has received a U.S. National Science Foundation CAREER Award and three IBM Faculty Awards. His current research focuses on making data-intensive computing systems easier to manage.

Magda Balazinska is an Assistant Professor in the department of Computer Science and Engineering at the University of Washington. Magdalena's research interests are broadly in the fields of databases and distributed systems. Her current research focuses on data intensive scalable computing, sensor and scientific data management, and cloud computing. Magdalena holds a PhD from the Massachusetts Institute of Technology (2006). She is a Microsoft Research New Faculty Fellow (2007), received an NSF CAREER Award (2009), an HP Labs Research Innovation Award (2009-2011), a Rogel Faculty Support Award (2006), and a Microsoft Research Graduate Fellowship (2003-2005).

Roger Barga is an Architect in the Cloud Computing Futures team in Microsoft Research, where he leads a team responsible for developing tools and services on the Microsoft cloud computing platform to support data-intensive research. Roger joined Microsoft in 1997 as a Researcher in the Database Group of Microsoft Research, where he worked on both systems research and product development efforts in database, workflow and stream processing systems. Roger served as Principal Architect of the External Research Division of Microsoft Research (MSR) from 2007 to 2009, prior to joining the Cloud Computing Futures group. Roger has published over 50 peer-reviewed papers, filed over 30 patent applications, and served more than 70 times on program committees for more than 30 different international conferences and workshops.

Philip A. Bernstein is a Principal Researcher at Microsoft Corporation. Over the past 35 years, he has been a product architect at Microsoft and at Digital Equipment Corp., a professor at Harvard University and Wang Institute of Graduate Studies, and a VP Software at Sequoia Systems. During that time, he has published over 150 papers and two books on the theory and implementation of database systems, especially on transaction processing and metadata management. The second edition of his book Transaction Processing, with Eric Newcomer, was published in June 2009. In the metadata field, he has worked on software repositories, schema mapping generation, schema evolution, lineage tracing, data translation, and data integration, primarily for commercial applications. His latest work focuses on database systems for cloud computing and on web search over structured data. He is an Editor-in-Chief of the Very Large Databases Journal, a member of the National Academy Board on Mathematical Sciences and Applications, and Treasurer of the Computing Research Association. He is an ACM Fellow, a winner of the SIGMOD Innovations Award, and a member of the Washington State Academy of Sciences and the National Academy of Engineering.

Bill Bolosky is a Principal Researcher in the Systems and Networking Area at Microsoft Research in Redmond, where he's been since finishing graduate school in 1992. He's interested in distributed systems, storage and operating systems. He's worked on numerous projects, ranging from the Mach operating system to the Tiger video server to the Farsite serverless file system.

Vinayak Borkar is the ASTERIX Project Lead Software Engineer in the Information Systems Group, Bren School of ICS, UC Irvine. Vinayak's interests include database systems, information integration, XML query processing, and data-intensive parallel computing.

Paul Brown's career has been limited to working for database companies whose names began with the letter 'I': Ingres, Illustra, Informix and IBM. After putting bugs into four or five commercial DBMS products, at IBM he was sensibly constrained to working on a diverse set of research projects ranging over XML data processing, schema integration, and algorithms for sampling. Following IBM, he helped found the array database startup SciDB/Zetics, which is an attempt to build a system suitable for extremely large-scale scientific data storage, processing and analysis.

Michael Cafarella is a professor of Computer Science at the University of Michigan, having received his Ph.D. from the University of Washington in 2009. His research interests include information extraction, Web data management, and nontraditional data management tools. He is a co-founder of the Nutch and Hadoop open-source projects.

Mike Carey is currently a Bren Professor of Information and Computer Sciences at UC Irvine. Prior to rejoining academia in 2008, he worked at BEA Systems as chief architect and an engineering director for the BEA AquaLogic Data Services Platform team. Carey also spent a number of years as a Professor at the University of Wisconsin-Madison, at IBM Almaden as a database researcher and manager, and as a Fellow at e-commerce software startup Propel Software. Carey is an ACM Fellow, a member of the National Academy of Engineering, and a recipient of the ACM SIGMOD Edgar F. Codd Innovations Award. His current research interests are centered around data-intensive computing and scalable data management.

Surajit Chaudhuri is a Principal Researcher and a Research Area Manager overseeing data management research activities at MSR, Redmond. His areas of interest include self-tuning database systems, query optimization, data cleaning, and synergy between web search and DBMS technologies. Along with his colleagues in his research group and in Microsoft SQL Server, he helped develop database tuning advisor (previously known as index tuning wizard) and data cleaning transforms for SQL Server. As his work outside of database research, he led the development of CMT, a conference management web service hosted by Microsoft Research since 1999 for the academic community. Surajit has a PhD from Stanford University and is an ACM Fellow. He was awarded the ACM SIGMOD Contributions Award in 2004.

Amr El Abbadi is currently Professor and Chair of the Computer Science Department at the University of California, Santa Barbara. He received his B. Eng. in Computer Science from Alexandria University, Egypt, and received his Ph.D. in Computer Science from Cornell University in August 1987. In 2002-03, Prof. El Abbadi was director of the University of California, Education Abroad Center at the American University in Cairo. He served as a board member of the VLDB Endowment from 2002 to 2008. In 2007, Prof. El Abbadi received the UCSB Senate Outstanding Mentorship Award for his excellence in mentoring graduate students. He has published over 250 articles in databases and distributed systems.

Nigel Ellis is a Distinguished Engineer in Microsoft's Business Platform Division, where he leads the development work for the SQL Azure initiative.

Ellis joined Microsoft in 1993 as a developer on the popular JET desktop database engine used by Microsoft Access. In 1995 he joined the SQL Server group to help with the re-write of the RDBMS query processor for SQL Server 7.0. Ellis led the query processing development work through several major releases. He has held a variety of roles within the SQL Server division, including product-level architect and engineering manager. In recent years he has focused on bringing database-as-a-service to the cloud as part of the SQL Azure project. Ellis is a native of the United Kingdom, and holds a Bachelor of Science (Hons) in computer science from University College, Durham in England. Outside of work, he is a fan of spicy foods and enjoys the outdoors through backpacking trips and mountain biking. Ellis lives in Redmond, Wash., with his wife Lynn and several four-legged friends.

Goetz Graefe is an HP Fellow working in the Intelligent Information Management Lab within Hewlett-Packard Laboratories. His experience and expertise are focused on relational database management systems, gained in academic research, industrial consulting, and industrial product development. His current research efforts focus on new hardware technologies in database management as well as robustness in database request processing in order to reduce total cost of ownership.

Prior to joining Hewlett-Packard Laboratories in 2006, Goetz spent 12 years as software architect in product development at Microsoft, mostly in database management. Both query optimization and query execution of Microsoft's re-implementation of SQL Server are based on his designs. Prior to Microsoft, Goetz taught and researched database implementation techniques in academic settings, where he supervised multiple Ph.D. candidates who later contributed to several products and companies in senior roles.

Goetz's areas of expertise within database management systems cover compile-time query optimization including extensible query optimization, run-time query execution including parallel query execution, indexing, and transactions. He has also worked on transactional memory, specifically techniques for software implementations of transactional memory. Goetz's research credentials include numerous original publications as well as surveys published by ACM Computing Surveys. His original publications cover query optimization, query execution, and indexing, the latter with particular focus on novel techniques for the ubiquitous data structure called B-trees. His work has been honored by the ACM SIGMOD 2000 "test of time" award for work on parallel query execution, by the IEEE ICDE 2005 "influential paper" award for work on extensible query execution, and by the 2009 ACM "software systems" award for participation in the Gamma database machine research project. He has originated numerous patent applications.

Jeff Hammerbacher is a founder and the Chief Scientist of Cloudera. Jeff was an Entrepreneur in Residence at Accel Partners immediately prior to Cloudera. Before Accel, he conceived, built, and led the Data team at Facebook. The Data team was responsible for driving many of the applications of statistics and machine learning at Facebook, as well as building out the infrastructure to support these tasks for massive data sets. The Data team produced open source projects such as Hive and Cassandra and their work was recognized at conferences such as CHI, ICWSM, SIGMOD, and VLDB. Before joining Facebook, Jeff was a quantitative analyst on Wall Street. Jeff earned his Bachelor's Degree in Mathematics from Harvard University and recently served as a Contributing Editor for O'Reilly's "Beautiful Data".

Bill Howe is a Senior Scientist at the eScience Institute at the University of Washington and an Affiliate Assistant Professor in the Computer Science and Engineering Department, also at UW. His research focuses on new database platforms for data-intensive science through awards from NSF, the Gordon and Betty Moore Foundation, Microsoft Research, and PNNL. Bill holds a PhD in Computer Science from Portland State University and a Bachelor's degree in Industrial and Systems Engineering from Georgia Tech.

Chris Jermaine studies how to build and use analytic databases. He is an associate professor of computer science at Rice University. He received a BA from the Mathematics Department at UCSD, an MSc from the Computer Science and Engineering Department at OSU, and a PhD from the College of Computing at Georgia Tech. Chris is the recipient of a 2008 Alfred P. Sloan Foundation Research Fellowship, a National Science Foundation CAREER award, and a 2007 ACM SIGMOD Best Paper Award.

Martin Kersten received his PhD in Computer Science from the Vrije Universiteit in 1985 on research in database security, whereafter he moved to CWI to establish the Database Research Group. From 1979 until 1985 he developed a small relational kernel, called Troll, which was sold as part of a CASE tool from 1985 to 1991. Between 1986 and 1991 he was co-designer of the PRISMA database machine, an RDBMS for a 100-node multiprocessor based on the assumption that the hotset is memory resident. In 1992 he initiated the development of his 3rd DBMS, called MonetDB, a column-store DBMS used world-wide and pivotal in several national projects aimed at advanced database applications.

Currently he is heading a department involving 50 researchers in areas covering database architectures, multimedia information systems, information retrieval, and visualisation. Since 1994 he has been a professor at the University of Amsterdam. In 1995 he co-founded Data Distilleries, an SME specialized in data mining technology, and in 2008 co-founded MonetDB and VectorWise. His current research interests are parallel/distributed database architectures and query optimization in science database applications. He has published ca. 170 scientific papers. He acts as a reviewer for ESPRIT projects and is a trustee emeritus of the VLDB Endowment board.

Donald Kossmann is a professor of Computer Science at ETH Zurich (Switzerland). He is also a co-founder of 28msec Inc., a start-up that develops an XML-based database system in the cloud. He received his MS in 1991 from the University of Karlsruhe and completed his PhD in 1995 at the Technical University of Aachen. After that, he held positions at the University of Maryland, the IBM Almaden Research Center, the University of Passau, the Technical University of Munich, and the University of Heidelberg. At ETH Zurich and 28msec, he develops new technologies at the intersection of database systems, web technologies, and distributed systems. Before joining ETH and 28msec, Donald Kossmann was a co-founder of i-TV-T AG (1998, still in business) and XQRL Inc. (founded in 2002 and acquired by BEA in the same year).

Tim Kraska is a PostDoc in the RAD Lab which is part of the Computer Science Division of UC Berkeley. Currently, his research focuses on data management for cloud-scale analytics, machine learning, and crowd sourcing in the context of the upcoming AMP Lab. Before joining UC Berkeley, Tim Kraska received his PhD from ETH Zurich, where he worked on transaction management and stream processing in the cloud at the Systems Group. He also holds a Master of Information Technology from the University of Sydney, Australia, as well as an MSc from the University of Muenster, Germany.

Per-Åke (Paul) Larson is a Principal Researcher at Microsoft Research. His primary research area is query optimization and query processing in database systems. Prior to joining Microsoft Research, he was a Professor in the Department of Computer Science at the University of Waterloo, Canada, for 15 years, serving as department chair for three years. He received his Ph.D. in 1976 from Åbo Akademi University in Finland. Dr. Larson is a Fellow of the ACM.

Jim Larus is a Director of the eXtreme Computing Group (XCG) in Microsoft Research. He has been an active contributor to the programming languages, compiler, and computer architecture communities. He has published many papers and served on numerous program committees and NSF and NRC panels. Larus became an ACM Fellow in 2006.

Larus joined Microsoft Research as a Senior Researcher in 1998 to start and, for five years, lead the Software Productivity Tools (SPT) group, which developed and applied a variety of innovative techniques in static program analysis and constructed tools that found defects (bugs) in software. This group's research has had considerable impact on the research community and has shipped in Microsoft products such as the Static Driver Verifier and FX/Cop, as well as in other widely used internal software development tools. Larus then became the Research Area Manager for programming languages and tools and started the Singularity research project, which demonstrated that modern programming languages and software engineering techniques could fundamentally improve software architectures. Subsequently, he helped start XCG, which is developing the hardware and software to support cloud computing.

Before joining Microsoft, Larus was an Assistant and Associate Professor of Computer Science at the University of Wisconsin-Madison, where he published approximately 60 research papers and co-led the Wisconsin Wind Tunnel (WWT) research project with Professors Mark Hill and David Wood. WWT was a DARPA- and NSF-funded project that investigated new approaches to simulating, building, and programming parallel shared-memory computers. Larus's research spanned a number of areas, including new and efficient techniques for measuring and recording executing programs' behavior, tools for analyzing and manipulating compiled and linked programs, programming languages for parallel computing, tools for verifying program correctness, and techniques for compiler analysis and optimization.

Larus received his MS and PhD in Computer Science from the University of California, Berkeley in 1989, and an AB in Applied Mathematics from Harvard in 1980. At Berkeley, Larus developed one of the first systems to analyze Lisp programs and determine how to best execute them on a parallel computer.

Hank Levy is Chairman and Wissner-Slivka Chair of Computer Science & Engineering at the University of Washington. Levy's research involves operating systems, distributed systems, computer architecture, security, and the Web. Among his publications are 8 "best paper" awards from SOSP/OSDI (plus 7 more from other top conferences). With his UW colleagues, he invented Simultaneous Multithreading, which is used in a number of modern CPUs (e.g., Intel's "Hyperthreading"). Hank is a Fellow of the ACM, a Fellow of the IEEE, and recipient of a Fulbright Research Scholar Award.

Luke Lonergan is the Co-Founder and Chief Technology Officer at Greenplum. Prior to Greenplum, Luke founded Didera, a database clustering company, in 2000 and served as CEO and Chairman. Luke's background includes 16 years of management experience in computing technology ranging from innovations in supercomputing to advances in medical imaging systems. Most recently, he directed data center integration at High Performance Technologies Inc (HPTi), scaling the business to $30M, and setting industry firsts in parallel computing subsequently adopted by IBM and Compaq. Previously he held management positions at Northrop Grumman Corporation. He holds an M.S. in Aeronautics and Astronautics from Stanford University and a B.E. in Mathematics from Vanderbilt University.

David Maier is Maseeh Professor of Emerging Technologies at Portland State University. Prior to his current position, he was on the faculties at SUNY Stony Brook and Oregon Graduate Institute. He has spent extended visits with INRIA, University of Wisconsin - Madison, and Microsoft Research. He is the author of books on relational databases, logic programming and object-oriented databases, as well as papers in database theory, object-oriented technology, scientific databases and dataspace management. He received the Presidential Young Investigator Award from the National Science Foundation in 1984 and was awarded the 1997 SIGMOD Innovations Award for his contributions in objects and databases. He is also an ACM Fellow and IEEE Senior Member. He holds a dual B.A. in Mathematics and in Computer Science from the University of Oregon (Honors College, 1974) and a Ph.D. in Electrical Engineering and Computer Science from Princeton University (1978).

Vivek Narasayya is a Principal Researcher in the Data Management, Exploration and Mining (DMX) Group at Microsoft Research (MSR). His research interests include self-tuning databases, query optimization and processing and DBMS testing. Along with colleagues at MSR and the SQL Server product group, he developed the Database Engine Tuning Advisor, a physical database design tool that ships in Microsoft SQL Server. He was awarded the VLDB 10 year Best Paper Award in 2007. Vivek received his Ph.D. in Computer Science and Engineering from the University of Washington in 2000.

Raghu Ramakrishnan is Chief Scientist for Audience and Cloud Computing at Yahoo!, and is a Yahoo! Fellow, heading the Web Information Management research group. His work in database systems, with a focus on data mining, query optimization, and web-scale data management, has influenced query optimization in commercial database systems and the design of window functions in SQL:1999. His paper on the Birch clustering algorithm received the SIGMOD 10-Year Test-of-Time award, and he has written the widely-used text "Database Management Systems" (with Johannes Gehrke). Ramakrishnan has received several awards, including the ACM SIGKDD Innovations Award, the ACM SIGMOD Contributions Award, a Distinguished Alumnus Award from IIT Madras, a Packard Foundation Fellowship in Science and Engineering, and an NSF Presidential Young Investigator Award. He is a Fellow of the ACM and IEEE.

Ramakrishnan is on the Board of Directors of ACM SIGKDD, and is a past Chair of ACM SIGMOD and the Board of Trustees of the VLDB Endowment. He was Professor of Computer Sciences at the University of Wisconsin-Madison, and was founder and CTO of QUIQ, a company that pioneered question-answering communities, powering Ask Jeeves' AnswerPoint as well as customer-support for companies such as Compaq.

Balan Sethu Raman is a Distinguished Engineer in the Database Systems Group. He started his engineering career aspiring to be a hardware designer. Things took an interesting turn while building tools for hardware design, when he discovered that he found designing data structures to handle large amounts of data exciting. His career choice in software was cemented when he joined Microsoft.

As a software engineer in the Windows group Sethu worked on various aspects of file systems. His work ranged from developing network file systems and client side caching to distributed file systems. During this period he recognized the need for richer services to cope with the growing amounts of digital data, and he pioneered efforts across divisions to address this need. Sethu started working with the Microsoft SQL Server team to design better solutions for storing blobs in a database and offer richer services within the product's infrastructure. Since then he has been designing and developing features for building innovative applications to incorporate various kinds of data. Sethu soon recognized the importance of handling data streams because of the proliferation of data sources and the volumes of data being generated by them. He subsequently embarked on building an incubation effort for handling data streams in conjunction with Microsoft Research. He is currently focused on translating this into a product offering and manages the team responsible for it.

Outside of work Sethu keeps his curiosity alive by picking a new activity every year, pouring his energy into it, and becoming proficient at it. He can often be found running the trails near campus or developing his swimming skills. He also enjoys spending time with his wife and helping their son realize his dreams.

Berthold Reinwald joined the database group at the IBM Almaden Research Center in 1993. His current research areas include cloud database technology and scalable analytics platforms. He received his Ph.D. from the University of Erlangen-Nuernberg, Germany, in 1993.

Donovan Schneider is a Principal Architect at Salesforce.com where he is focused on real-time business intelligence for their multi-tenant software-as-a-service system. Donovan has over twenty years of experience in research and development in data warehousing, business intelligence, and parallel processing. He has worked at innovative startups like Red Brick and nQuire, as well as Siebel, Yahoo!, and HP-Labs. Dr. Schneider has a PhD in Computer Science from the University of Wisconsin - Madison. He is a recipient of the 2008 ACM Software Systems award for work on the Gamma parallel database system.

Doug Terry is a Principal Researcher in the Microsoft Research Silicon Valley Lab. His research focuses on the design and implementation of novel distributed systems and addresses issues such as information management, fault-tolerance, weakly consistent replication, and mobility. He also serves as Chair of ACM's Special Interest Group on Operating Systems (SIGOPS) and was recently elected to the ACM Council. Prior to joining Microsoft, Doug was the co-founder and CTO of a start-up company named Cogenia, Chief Scientist of the Computer Science Laboratory at Xerox PARC, and an Adjunct Professor in the Computer Science Division at U. C. Berkeley, where he regularly teaches a graduate course on distributed systems. Doug has a Ph.D. in Computer Science from U. C. Berkeley and is an ACM Fellow.

Jeff Ullman is the Stanford W. Ascherman Professor of Computer Science (Emeritus). His interests include database theory, database integration, data mining, and education using the information infrastructure.

Ullman has been working with Foto Afrati on algorithms that exploit the map-reduce environment (often called distributed file systems) but that are not necessarily constrained by the limitations of Map processes feeding Reduce processes. "Optimizing Joins in a Map-Reduce Environment" looks at the multiway join in particular, and shows how to find the optimum way to distribute the work among Reduce processes. It appears in EDBT 2010. "A New Computation Model for Cluster Computing" is a rewrite of something that was rejected from PODS 2009. We have submitted it to PODS 2010, and we hope the PC will recognize the importance of what we are trying to do: provide a model in which one can talk about the relative merits of algorithms for execution on Hadoop, Clustera, or similar programming systems. (Added later: they again missed the point.)

Haixun Wang

Yuan Yu is a principal researcher at Microsoft Research Silicon Valley lab, where he currently works on systems and programming models for large-scale parallel and distributed computing. At Microsoft, Yuan is involved in the development of Dryad and DryadLINQ. Prior to joining Microsoft, he was a senior member of technical staff at the DEC/Compaq Systems Research Center. Yuan has a PhD in Applied Mathematics from University of Texas at Austin.

Stan Zdonik has been a Professor of Computer Science at Brown University since 1983. He has been involved in a diverse set of topics including object-oriented database systems, semantic query optimization, network information systems, data management for mobile systems, data dissemination, stream processing, column stores, and more recently, large-scale data management for science. This work has been published in highly competitive database conferences and journals. Support for this work has come from many government agencies and industrial sources including the National Science Foundation (NSF), the Defense Advanced Research Projects Agency (DARPA), IBM, Digital Equipment Corporation (DEC), Apple Computer, Intel, and Sun Microsystems. He is a co-founder of both Streambase and Vertica.

He served as a member of the VLDB Endowment, an editor of several academic journals, and has been Program Chair for both the VLDB and the ICDE database conferences and General Co-Chair for SIGMOD 2009. He was the sole recipient of the prestigious Office of Naval Research Young Investigator Award in computer science in 1986, and he is an ACM Fellow. He received his PhD from MIT in 1983. In the mid-seventies, he worked on an advanced data management system for pharmacologists at Bolt, Beranek, and Newman, Inc. in Cambridge, MA, and he has been a consultant to many major US corporations.

Jingren Zhou is a member of the Bing Search Infrastructure Team. Zhou manages a team that designs and develops distributed query processing and optimization (SCOPE/Cosmos) on large-scale data clusters of shared-nothing commodity servers. Previously, he was a researcher in the Database Group at Microsoft Research, part of the Microsoft Corporation. Zhou's research is in the area of databases, in particular query processing, query optimization, large-scale distributed computing, and architecture-conscious database systems. Before joining Microsoft, he obtained his Ph.D. in Computer Science at Columbia University and his B.S. at the University of Science and Technology of China.

Last updated: 2 August 2010

Computer Science & Engineering Box 352350, University of Washington Seattle, WA 98195-2350