Journal of Artificial Intelligence Research 2 (1995) 369-409
Submitted 10/94; published 3/95
(c) 1995 National Research Council Canada. All rights reserved.
Published by permission.
Cost-Sensitive Classification: Empirical Evaluation
of a Hybrid Genetic Decision Tree Induction Algorithm
Peter D. Turney
(peter@ai.iit.nrc.ca)
Knowledge Systems Laboratory
Institute for Information Technology
National Research Council Canada
Ottawa, Ontario, Canada, K1A 0R6
This paper introduces ICET, a new algorithm for cost-sensitive
classification. ICET uses a genetic algorithm to evolve a population of
biases for a decision tree induction algorithm. The fitness function
of the genetic algorithm is the average cost of classification when
using the decision tree, including both the costs of tests (features,
measurements) and the costs of classification errors. ICET is compared
here with three other algorithms for cost-sensitive classification --
EG2, CS-ID3, and IDX -- and also with C4.5, which classifies without
regard to cost. The five algorithms are evaluated empirically on five
real-world medical datasets. Three sets of experiments are performed.
The first set examines the baseline performance of the five algorithms
on the five datasets and establishes that ICET performs significantly
better than its competitors. The second set tests the robustness of
ICET under a variety of conditions and shows that ICET maintains its
advantage. The third set looks at ICET's search in bias space and
discovers a way to improve the search.
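
The fitness measure described above, the average cost of classification including both test costs and error costs, can be illustrated with a minimal sketch. Everything in it is a hypothetical illustration rather than the ICET implementation: the `Node` structure, the `case_cost` and `average_cost` functions, the feature name `blood`, and all cost values are assumptions. The sketch also assumes that a test is paid for at most once per case and that correct classifications cost nothing.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical binary decision tree node (not the paper's data structure)."""
    test: Optional[str] = None       # feature measured at this node; None for a leaf
    threshold: float = 0.0           # go left when the measured value <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None      # predicted class; set only on leaves


def case_cost(tree: Node, case: Dict[str, float], true_label: int,
              test_costs: Dict[str, float],
              error_costs: Dict[Tuple[int, int], float]) -> float:
    """Cost of classifying one case: tests performed on the path plus any error cost."""
    node, paid, cost = tree, set(), 0.0
    while node.label is None:
        if node.test not in paid:    # assumption: each test is charged once per case
            cost += test_costs[node.test]
            paid.add(node.test)
        node = node.left if case[node.test] <= node.threshold else node.right
    # error_costs is keyed by (actual class, predicted class)
    return cost + error_costs[(true_label, node.label)]


def average_cost(tree: Node, cases: List[Dict[str, float]], labels: List[int],
                 test_costs: Dict[str, float],
                 error_costs: Dict[Tuple[int, int], float]) -> float:
    """Average cost of classification over a set of cases."""
    total = sum(case_cost(tree, c, y, test_costs, error_costs)
                for c, y in zip(cases, labels))
    return total / len(cases)


# Illustrative values only: one test costing 5, each misclassification costing 50.
tree = Node(test="blood", threshold=0.5, left=Node(label=0), right=Node(label=1))
test_costs = {"blood": 5.0}
error_costs = {(0, 0): 0.0, (1, 1): 0.0, (0, 1): 50.0, (1, 0): 50.0}
cases = [{"blood": 0.2}, {"blood": 0.9}, {"blood": 0.3}, {"blood": 0.8}]
labels = [0, 1, 1, 1]

avg = average_cost(tree, cases, labels, test_costs, error_costs)
```

In this toy setup every case pays for the one test, and the third case is also misclassified, so the error cost dominates the average. A genetic algorithm of the kind described would prefer biases whose induced trees yield a lower value of this measure.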