How can we avoid overfitting?
stop growing when data split not statistically significant
grow full tree, then post-prune
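One way to decide whether a split is statistically significant is a chi-square test on the class counts in each branch. The sketch below is only an illustration under stated assumptions: two classes, a binary split, and a 5% significance level. The function names are hypothetical, and other tests or thresholds could be used instead.

```python
# Assumption: two classes and a binary split; each branch is a
# (positives, negatives) pair of training-example counts.

def chi_square_statistic(left_counts, right_counts):
    """Pearson chi-square statistic comparing the observed class counts in
    each branch with the counts expected if the split were uninformative."""
    branches = [left_counts, right_counts]
    total = sum(sum(b) for b in branches)
    class_totals = [sum(b[k] for b in branches) for k in range(2)]
    stat = 0.0
    for b in branches:
        branch_size = sum(b)
        for k in range(2):
            expected = branch_size * class_totals[k] / total
            if expected > 0:
                stat += (b[k] - expected) ** 2 / expected
    return stat

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_VALUE_95_DF1 = 3.841

def split_is_significant(left_counts, right_counts,
                         critical=CRITICAL_VALUE_95_DF1):
    """Return True if the split should be kept (stop growing otherwise)."""
    return chi_square_statistic(left_counts, right_counts) > critical

# A split that separates the classes perfectly versus one that does nothing.
clean_split = split_is_significant((10, 0), (0, 10))
useless_split = split_is_significant((5, 5), (5, 5))
```

Here `clean_split` is true and `useless_split` is false, so tree growth would stop at the second candidate split.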
How to select ``best'' tree:
Measure performance over training data
Measure performance over separate validation data set
MDL: minimize
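Post-pruning against a separate validation set can be sketched as follows. This is a simplified single bottom-up pass, not necessarily the exact procedure intended here: each internal node is replaced by a leaf carrying its majority label, and the change is kept only if validation accuracy does not drop. The `Node` class, the attribute names, and the toy data are all hypothetical.

```python
class Node:
    def __init__(self, label, attr=None, children=None):
        self.label = label              # majority training class at this node
        self.attr = attr                # attribute tested here; None for a leaf
        self.children = children or {}  # attribute value -> child Node

def predict(node, x):
    """Follow attribute tests down the tree; fall back to the node's
    majority label if an attribute value has no matching branch."""
    while node.attr is not None:
        child = node.children.get(x[node.attr])
        if child is None:
            break
        node = child
    return node.label

def accuracy(root, data):
    return sum(predict(root, x) == y for x, y in data) / len(data)

def reduced_error_prune(root, validation):
    """Bottom-up pass: turn a node into a leaf unless that lowers
    accuracy on the validation set."""
    def visit(node):
        if node.attr is None:
            return
        for child in node.children.values():
            visit(child)
        before = accuracy(root, validation)
        saved = (node.attr, node.children)
        node.attr, node.children = None, {}       # tentatively prune
        if accuracy(root, validation) < before:
            node.attr, node.children = saved      # pruning hurt: undo
    visit(root)
    return root

# Toy tree whose "windy" subtree fits noise in the training data.
tree = Node("yes", "outlook", {
    "sunny": Node("no", "windy", {True: Node("no"), False: Node("yes")}),
    "rainy": Node("yes"),
})
validation = [
    ({"outlook": "sunny", "windy": False}, "no"),
    ({"outlook": "sunny", "windy": True}, "no"),
    ({"outlook": "rainy", "windy": False}, "yes"),
]
pruned = reduced_error_prune(tree, validation)
```

In this toy case the `windy` subtree is collapsed into a single leaf, while the root test on `outlook` survives because removing it would lower validation accuracy.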