
4.2.3 Complex Classification Cost Matrices


4.2.4 Poorly Estimated Classification Cost

We believe that ICET's sensitivity to both test costs and classification error costs is an advantage. However, it might be argued that the cost of classification errors is difficult to calculate in many real-world applications. Thus it is possible that an algorithm that ignores the cost of classification errors (e.g., EG2, CS-ID3, IDX) may be more robust and useful than an algorithm that is sensitive to the cost of classification errors (e.g., ICET). To address this possibility, we examine what happens when ICET is trained with a certain penalty for classification errors, then tested with a different penalty.

Our hypothesis was that ICET would be robust to reasonable differences between the penalty during training and the penalty during testing. Table 11 shows what happens when ICET is trained with a penalty of $100 for classification errors, then tested with penalties of $50, $100, and $500. We see that ICET has the best performance of the five algorithms, although its edge is quite slight in the case where the penalty is $500 during testing.

We also examined what happens (1) when ICET is trained with a penalty of $500 and tested with penalties of $100, $500, and $1,000 and (2) when ICET is trained with a penalty of $1,000 and tested with penalties of $500, $1,000, and $5,000. The results show essentially the same pattern as in Table 11: ICET is relatively robust to differences between the training and testing penalties, at least when the penalties have the same order of magnitude. This suggests that ICET is applicable even in those situations where the reliability of the estimate of the cost of classification errors is dubious.
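The evaluation protocol described above can be sketched in a few lines. This is only an illustrative sketch, not the code used in the experiments: it assumes a single uniform penalty for every misclassification and assumes that the decision tree is fixed after training, so that changing the testing penalty changes only the error term of the average cost. The names `CaseOutcome`, `average_cost`, and `cost_under_penalties`, and the four example cases, are hypothetical and are not taken from the experimental data.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class CaseOutcome:
    """Result of running an already-trained tree on one test case (hypothetical type)."""
    test_cost: float  # total cost of the tests the tree requested for this case
    correct: bool     # True if the predicted class matched the true class


def average_cost(outcomes: Sequence[CaseOutcome], error_penalty: float) -> float:
    """Average cost per case: test costs plus a uniform penalty for each error."""
    if not outcomes:
        raise ValueError("need at least one test case")
    total = sum(
        o.test_cost + (0.0 if o.correct else error_penalty) for o in outcomes
    )
    return total / len(outcomes)


def cost_under_penalties(
    outcomes: Sequence[CaseOutcome], penalties: Sequence[float]
) -> Dict[float, float]:
    """Re-score the same fixed decisions under several testing penalties."""
    return {p: average_cost(outcomes, p) for p in penalties}


# Made-up outcomes for a tree trained under some fixed penalty (illustration only).
outcomes = [
    CaseOutcome(test_cost=10.0, correct=True),
    CaseOutcome(test_cost=10.0, correct=False),
    CaseOutcome(test_cost=25.0, correct=True),
    CaseOutcome(test_cost=5.0, correct=True),
]

scores = cost_under_penalties(outcomes, [50.0, 100.0, 500.0])
for penalty, cost in scores.items():
    print(f"testing penalty ${penalty:.0f}: average cost ${cost:.2f}")
```

Under these assumptions, the test costs incurred by the tree do not depend on the testing penalty. A tree trained with too low a penalty therefore pays for its mismatch only through its errors, while a tree trained with too high a penalty pays through tests that the testing penalty does not justify.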

When the penalty for errors on the testing set is $100, ICET works best when the penalty for errors on the training set is also $100. When the penalty for errors on the testing set is $500, ICET works best when the penalty for errors on the training set is also $500. When the penalty for errors on the testing set is $1,000, however, ICET works best when the penalty for errors on the training set is $500. This suggests that there might be an advantage in some situations to underestimating the penalty for errors during training. In other words, ICET may have a tendency to overestimate the benefits of tests (this is likely due to overfitting the training data).


4.3 Searching Bias Space