Dietterich, T.G. and Bakiri, G. (1995)
"Solving Multiclass Learning Problems via Error-Correcting Output Codes",
Journal of Artificial Intelligence Research, Volume 2, pages 263-286.
Abstract:
Multiclass learning problems involve finding a definition
for an unknown function f(x) whose range is a discrete set
containing k > 2 values (i.e., k "classes"). The
definition is acquired by studying collections of training examples of
the form [x_i, f(x_i)]. Existing approaches to
multiclass learning problems include direct application of multiclass
algorithms such as the decision-tree algorithms C4.5 and CART,
application of binary concept learning algorithms to learn individual
binary functions for each of the k classes, and application of
binary concept learning algorithms with distributed output
representations. This paper compares these three approaches to a new
technique in which error-correcting codes are employed as a
distributed output representation. We show that these output
representations improve the generalization performance of both C4.5
and backpropagation on a wide range of multiclass learning tasks. We
also demonstrate that this approach is robust with respect to changes
in the size of the training sample, the assignment of distributed
representations to particular classes, and the application of
overfitting avoidance techniques such as decision-tree pruning.
Finally, we show that, like the other methods, the error-correcting
code technique can provide reliable class probability estimates.
Taken together, these results demonstrate that error-correcting output
codes provide a general-purpose method for improving the performance
of inductive learning programs on multiclass problems.
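
The decoding idea behind an error-correcting output representation can be
illustrated with a minimal sketch. Everything below is an assumption made
for illustration and is not taken from the paper's experiments: the class
names, the 7-bit code matrix for k = 4 classes, and the helper names
hamming, decode, and min_distance are all hypothetical. Each class is
assigned a binary codeword, one binary learner would be trained per bit
position, and a new example is assigned to the class whose codeword is
nearest in Hamming distance to the vector of predicted bits.

```python
from itertools import combinations

# Hypothetical code matrix: one 7-bit codeword per class (illustrative only).
CODE_MATRIX = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (0, 0, 0, 0, 1, 1, 1),
    "C": (0, 0, 1, 1, 0, 0, 1),
    "D": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(code_matrix):
    """Smallest Hamming distance between any two distinct codewords."""
    return min(hamming(u, v) for u, v in combinations(code_matrix.values(), 2))


def decode(predicted_bits, code_matrix=CODE_MATRIX):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(code_matrix, key=lambda c: hamming(code_matrix[c], predicted_bits))


# Pretend the seven binary learners produced the codeword for "C",
# except that the learner for bit 0 made a mistake.
noisy = (1, 0, 1, 1, 0, 0, 1)
predicted_class = decode(noisy)
```

With a minimum distance of d between codewords, nearest-codeword decoding
tolerates up to floor((d - 1) / 2) wrong bit predictions. In this sketch
d = 4, so the single wrong bit in noisy still decodes to class "C".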