Title: Speeding up gradient boosting for training and prediction

Advisors: Carlos Guestrin and Arvind Krishnamurthy

Abstract: Gradient boosting [3] is one of most popular algorithms due to its simplicity and predictive performance. The algorithm produces an ensemble of decision trees, which has many desirable properties as a statistical model. XGBoost [1] is a fast, scalable implementation of gradient boosting that has found a wide following among hobbyists and practitioners. This paper presents our recent contribution for fast training and prediction. Using a series of optimizations such as feature quantization and histogram aggregation of gradients, we accelerate training by 2-4 times without loss of model accuracy. We add support for multiple tree-growing strategies, allowing for potentially unbalanced trees in pursuit of faster loss reduction. In addition, we add a dedicated framework for prediction, in which XGBoost emits generated code that can be used in isolation by user applications. Applications would no longer have to link XGBoost in order to make predictions.

CSE 303
Friday, May 19, 2017 - 09:00 to 10:30