The Hepatitis Prognosis dataset was donated by Gail Gong. Table 19 shows the test costs for the Hepatitis dataset. Unlike the other four datasets, this dataset deals with prognosis, not diagnosis. With prognosis, the diagnosis is already known, and the problem is to determine the likely outcome of the disease. The tests that were assigned a nominal cost of $1.00 involve either asking the patient a question or performing a basic physical examination. The tests in group A share the cost of $2.10 for collecting blood. Note that, although performing a histological examination of the liver costs $81.64, asking the patient whether a histology was performed costs only $1.00. Thus the prognosis can exploit the information conveyed by a decision (to perform a histological examination) that was made during the diagnosis. The class variable has the values 1 (die) and 2 (live). Table 20 shows the classification costs.

The dataset contains 155 cases, with many missing values. In our ten random splits, the training sets had 103 cases and the testing sets had 52 cases. We filled in the missing values using a simple single nearest neighbor algorithm (Aha et al., 1991). The missing values were filled in using the whole dataset, before the dataset was split into training and testing sets. For the nearest neighbor algorithm, the data were normalized so that the minimum value of each feature was 0 and the maximum value was 1. The distance measure was the sum of the absolute values of the differences between feature values. The difference between two values was defined to be 1 if one or both of the values were missing.
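
The missing-value procedure described above can be illustrated with a minimal sketch. It is not the original implementation: the function names (`normalize`, `distance`, `impute_1nn`) and the use of `None` to mark a missing value are hypothetical. The text does not say what happens when the nearest case is itself missing the feature in question, so the sketch assumes that only cases with that feature observed are eligible as donors. It also assumes that a constant feature is mapped to 0 and that the filled-in value is copied on the original (unnormalized) scale.

```python
from typing import List, Optional

Row = List[Optional[float]]  # None marks a missing value (illustrative convention)


def normalize(data: List[Row]) -> List[Row]:
    """Rescale each feature to [0, 1] using its observed minimum and maximum."""
    scaled = [list(row) for row in data]
    for j in range(len(data[0])):
        observed = [row[j] for row in data if row[j] is not None]
        if not observed:
            continue  # feature is missing in every case; nothing to rescale
        lo, hi = min(observed), max(observed)
        span = hi - lo
        for row in scaled:
            if row[j] is not None:
                # Assumption: a constant feature maps to 0 (avoids division by zero).
                row[j] = (row[j] - lo) / span if span > 0 else 0.0
    return scaled


def distance(a: Row, b: Row) -> float:
    """Sum of absolute differences; a difference is 1 if either value is missing."""
    total = 0.0
    for x, y in zip(a, b):
        total += 1.0 if x is None or y is None else abs(x - y)
    return total


def impute_1nn(data: List[Row]) -> List[Row]:
    """Fill each missing value from the single nearest case over the whole dataset."""
    scaled = normalize(data)
    filled = [list(row) for row in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Assumption: only cases that have feature j observed can act as donors.
            donors = [k for k in range(len(data))
                      if k != i and data[k][j] is not None]
            if not donors:
                continue  # no donor available; leave the value missing
            nearest = min(donors, key=lambda k: distance(scaled[i], scaled[k]))
            filled[i][j] = data[nearest][j]  # copy the value on its original scale
    return filled
```

Distances are always computed from the original, partially missing data rather than from values that have already been filled in, so the result does not depend on the order in which cases are processed. As in the text, imputation would run once on the full dataset, before any split into training and testing sets.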