Title: Optimizing Systems using Machine Learning
Advisor: Arvind Krishnamurthy
Supervisory Committee: Arvind Krishnamurthy (Chair), Radha Poovendran (GSR, EE), Xi Wang, and Kevin Jamieson
Abstract: Computer systems comprise many components that interact and cooperate to perform one or more tasks. Traditionally, many of these systems base their decisions on rules or configurations defined by operators, as well as on handcrafted analytical models. However, creating those rules or engineering such models is challenging. First, the same system should be able to work under a combinatorial number of constraints on top of heterogeneous hardware. Second, it should support different types of workloads and run in potentially widely different settings. Third, it should be able to handle time-varying resource needs. These factors make reasoning about systems performance far from trivial.
In this thesis, we propose optimizing systems using Machine Learning (ML) techniques. By doing so, we aim to relieve system designers of the burden of manually tuning rules and handcrafting complex analytical models, to narrow the gap in systems performance, and to promote a generation of smarter systems that learn from past experience and improve their performance over time. In this talk, we present two systems that illustrate the impact of these ML-based optimizations.
First, we introduce ADARES, an adaptive system that uses Reinforcement Learning (RL) techniques to adjust virtual machine resources, namely virtual CPUs and memory, on the fly, based on workload characteristics and other attributes of the virtualized environment. Then, we present CURATOR, a MapReduce-based framework for storage systems that safeguards storage health and performance by executing background maintenance tasks, which we also schedule using RL.
Throughout this thesis, we present instantiations of different ML models and show empirically how our formulations improve systems performance and efficiency. We propose pre-initializing model-free learners with historical traces to accelerate training, thereby reducing sample complexity. Our models cope with heterogeneity in workloads, settings, and resources, and adapt to non-stationary dynamics.
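
As a rough illustration of pre-initializing a model-free learner with historical traces, the sketch below replays logged transitions through tabular Q-learning before any online interaction. The abstract does not specify the learner, so tabular Q-learning stands in here as an assumption. The state names, action names, rewards, and the helpers `pretrain_q_table` and `greedy_action` are all hypothetical and are not taken from ADARES or CURATOR.

```python
from collections import defaultdict
from typing import DefaultDict, List, Sequence, Tuple

# One logged transition: (state, action, reward, next_state).
Transition = Tuple[str, str, float, str]
QTable = DefaultDict[Tuple[str, str], float]

# Hypothetical action set and historical traces, for illustration only.
ACTIONS: List[str] = ["add_vcpu", "remove_vcpu"]
TRACES: List[Transition] = [
    ("overloaded", "add_vcpu", 1.0, "balanced"),
    ("overloaded", "remove_vcpu", -1.0, "overloaded"),
    ("idle", "remove_vcpu", 1.0, "balanced"),
    ("idle", "add_vcpu", -1.0, "idle"),
]


def pretrain_q_table(
    traces: Sequence[Transition],
    actions: Sequence[str],
    alpha: float = 0.5,
    gamma: float = 0.9,
    sweeps: int = 50,
) -> QTable:
    """Replay logged transitions with the standard Q-learning update."""
    q: QTable = defaultdict(float)  # unseen (state, action) pairs start at 0
    for _ in range(sweeps):
        for state, action, reward, next_state in traces:
            best_next = max(q[(next_state, a)] for a in actions)
            td_target = reward + gamma * best_next
            q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q


def greedy_action(q: QTable, state: str, actions: Sequence[str]) -> str:
    """Pick the action with the highest estimated value in this state."""
    return max(actions, key=lambda a: q[(state, a)])


# The pre-initialized table would then seed online learning instead of zeros.
q_table = pretrain_q_table(TRACES, ACTIONS)
print(greedy_action(q_table, "overloaded", ACTIONS))
print(greedy_action(q_table, "idle", ACTIONS))
```

The intent is that an online learner starting from `q_table` already prefers sensible actions in states covered by the traces, so it needs fewer live samples to reach a useful policy. States absent from the traces keep their default value and still have to be explored.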