My interests are broadly in the fields of databases and distributed systems. My current work focuses on data management for data science, cloud computing, image and video analytics, database systems for VR/AR, and machine learning + databases.
- LightDB: Data management system for video data. We are building a new data management system for virtual reality videos and other video applications. Our work includes building a new benchmark, called Visual Road, for this type of workload.
- Image analytics: Database support for machine learning workloads. We are developing new data management techniques to support novel workloads, including machine learning. Our focus so far has been on accelerating and better supporting deep learning over scientific image databases.
- DeepQuery: Machine learning for database systems. We study how machine learning, including deep learning, can serve to improve data management systems. This project also included the Cuttlefish work that applied reinforcement learning to adaptive query processing.
- Themis: Open world data management and analytics system. We are developing a new type of data management and analytics system designed to work with samples of real-world data, yet answer analytical queries about that world. This project also includes work on query optimization and the EntropyDB system.
- Myria: Big data management as a cloud service. In this project, we are building a new big data management system as a cloud service and are studying the various associated technical challenges.
- Data Eco$y$tem: Data Management and Pricing in the Cloud.
- CQMS: Collaborative query management.
- Nuage: Data management in the cloud (first project).
- SciDB: Array data management. The UW-local part of SciDB is described here.
- RFID Ecosystem: Experimenting with a pervasive RFID-based infrastructure.
- Lahar: Markovian Stream Processing.
- Moirae: Exploiting history in monitoring applications.
- PEEX: Probabilistic Event EXtractor for RFID data.
- FlowDB: Using relational databases in network forensic analysis.
- StreamClean: Cleaning sensor data.
- HomeViews: Helping home users organize and share their data.
- Distributed stream processing with Borealis and Medusa.
- Study of user mobility patterns and network utilization in a corporate WLAN.
- Twine: Scalable resource discovery system for pervasive computing environments.
- Infranet: Internet censorship circumvention system.