My interests are broadly in the fields of databases and distributed systems. My current work focuses on data-intensive scalable computing, scientific data management, cloud computing, and stream processing.
- Myria: Big Data Management as a Cloud Service.
In this project, we are building a new Big Data Management system as a cloud service and are studying the various associated technical challenges.
- Nuage: Data management in the cloud.
In this project, we are developing new techniques for handling large volumes of data using cloud-computing environments, with a special emphasis on scientific applications.
As part of Nuage, we are also collaborating on the SciDB project, which aims to build an array-based, parallel database management system for scientific research. The UW-local part of SciDB is described here.
- CQMS: Collaborative query management.
As scientists (and others) store, analyze, and share increasingly large volumes of data in data centers, they need tools to help them author, annotate, share, and reuse their data analysis queries. The goal of this project is to enable such support, which is lacking in commercial database management systems.
- Data Eco$y$tem: Data Management and Pricing in the Cloud.
We study problems at the intersection of pricing and data management in emerging cloud-computing environments.
- RFID Ecosystem: Experimenting with a pervasive RFID-based infrastructure.
- Lahar: Markovian Stream Processing.
- Moirae: Exploiting history in monitoring applications.
- PEEX: Probabilistic Event EXtractor for RFID data.
- FlowDB: Using relational databases in network forensic analysis.
- StreamClean: Cleaning sensor data.
- HomeViews: Helping home users organize and share their data.
- Distributed stream processing with Borealis and Medusa.
- Study of user mobility patterns and network utilization in a corporate WLAN.
- Twine: Scalable resource discovery system for pervasive computing environments.
- Infranet: Internet censorship circumvention system.