The Entropy (Disorder) of a Collection
Suppose S is a collection containing positive and negative examples of the target concept:
- Entropy(S) = -(p+ log2 p+ + p- log2 p-)
- where p+ is the fraction of examples in S that are positive, p- is the fraction that are negative, and 0 log2 0 is taken to be 0
Good features
- minimum of 0 when p+ = 0 or p- = 0 (every example belongs to the same class)
- maximum of 1 when p+ = p- = 0.5 (an even split between the classes)
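The formula and its two extremes can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `entropy` and its count-based arguments are assumptions made for illustration, and the 9/5 collection is a made-up example.

```python
import math

def entropy(num_pos, num_neg):
    """Entropy, in bits, of a collection with the given class counts.

    Hypothetical helper: takes raw counts rather than fractions.
    """
    total = num_pos + num_neg
    if total == 0:
        raise ValueError("S must contain at least one example")
    result = 0.0
    for count in (num_pos, num_neg):
        if count > 0:  # skip empty classes: 0 log2 0 is treated as 0
            p = count / total
            result -= p * math.log2(p)
    return result

print(entropy(10, 0))  # all positive: 0.0
print(entropy(5, 5))   # even split: 1.0
print(entropy(9, 5))   # mixed collection: about 0.940
```

Skipping classes with a zero count avoids evaluating log2(0) and implements the convention that 0 log2 0 contributes nothing.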
Interpretation: the minimum number of bits needed, on average, to encode the classification of an arbitrary member of S.
We want to reduce the entropy in the collection as quickly as possible.
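One way to make "reducing the entropy" concrete is to compare the entropy of S with the size-weighted average entropy of the subsets produced by splitting it. The sketch below assumes that reading; the names `entropy` and `entropy_reduction` and the example counts are hypothetical and do not come from the notes.

```python
import math

def entropy(num_pos, num_neg):
    """Entropy, in bits, for the given class counts (0 log2 0 treated as 0)."""
    total = num_pos + num_neg
    result = 0.0
    for count in (num_pos, num_neg):
        if count > 0:
            p = count / total
            result -= p * math.log2(p)
    return result

def entropy_reduction(parent, subsets):
    """Entropy of the parent minus the size-weighted entropy of its subsets.

    parent and each subset are (num_pos, num_neg) pairs; the subsets are
    assumed to partition the parent collection.
    """
    total = sum(parent)
    remaining = sum(
        (sum(s) / total) * entropy(*s)
        for s in subsets
        if sum(s) > 0  # an empty subset contributes nothing
    )
    return entropy(*parent) - remaining

# A split that separates the classes perfectly removes all of the entropy.
print(entropy_reduction((5, 5), [(5, 0), (0, 5)]))  # 1.0

# A weaker split removes only a little (about 0.048 bits here).
print(entropy_reduction((9, 5), [(6, 2), (3, 3)]))
```

Under this reading, a split that yields a larger reduction leaves purer subsets, so choosing it lowers the entropy faster.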