Class weka.classifiers.CostMatrix
java.lang.Object
|
+----weka.core.Matrix
|
+----weka.classifiers.CostMatrix
- public class CostMatrix
- extends Matrix
Class for a misclassification cost matrix. The element in the i'th column
of the j'th row is the cost for (mis)classifying an instance of class j as
having class i. It is valid to have non-zero values down the diagonal
(these are typically negative to indicate some varying degree of "gain"
from making a correct prediction).
- Version:
- $Revision: 1.8 $
- Author:
- Len Trigg (len@intelligenesis.net)
FILE_EXTENSION
- The filename extension that should be used for cost files.
CostMatrix(CostMatrix)
- Creates a cost matrix identical to an existing matrix.
CostMatrix(int)
- Creates a default cost matrix for the given number of classes.
CostMatrix(Reader)
- Creates a cost matrix from a cost file.
applyCostMatrix(Instances, Random)
- Changes the dataset to reflect a given set of costs.
expectedCosts(double[])
- Calculates the expected misclassification cost for each possible
class value, given class probability estimates.
getMaxCost(int)
- Gets the maximum misclassification cost possible for a given actual
class value.
initialize()
- Sets the costs to default values (i.e. 0 down the diagonal, and 1 for
any misclassification).
main(String[])
- Tests out creation of a frequency dependent cost matrix from the command
line.
makeFrequencyDependentMatrix(Instances, double)
- Creates a cost matrix for the class attribute of the supplied instances,
where the misclassification costs are higher for misclassifying a rare
class as a frequent one.
normalize()
- Normalizes the cost matrix so that diagonal elements are zero.
readOldFormat(Reader)
- Reads misclassification cost matrix from given reader.
size()
- Gets the number of classes.
FILE_EXTENSION
public static java.lang.String FILE_EXTENSION
The filename extension that should be used for cost files.
CostMatrix
public CostMatrix(CostMatrix toCopy)
Creates a cost matrix identical to an existing matrix.
- Parameters:
toCopy
- the matrix to copy.
CostMatrix
public CostMatrix(int numClasses)
Creates a default cost matrix for the given number of classes. The
default misclassification cost is 1.
- Parameters:
numClasses
- the number of classes
CostMatrix
public CostMatrix(java.io.Reader r) throws java.lang.Exception
Creates a cost matrix from a cost file.
- Parameters:
r
- a reader from which the cost matrix will be read
- Throws:
- java.lang.Exception - if an error occurs
makeFrequencyDependentMatrix
public static CostMatrix makeFrequencyDependentMatrix(Instances instances,
double weight) throws java.lang.Exception
Creates a cost matrix for the class attribute of the supplied instances,
where the misclassification costs are higher for misclassifying a rare
class as a frequent one. The cost of classifying an instance of class i
as class j is weight * Pj / Pi, where Pi and Pj are Laplace estimates of
the class probabilities.
- Parameters:
instances
- the instances whose class attribute supplies the class frequencies
weight
- the weight by which each cost ratio Pj / Pi is multiplied
- Returns:
- the frequency dependent cost matrix
- Throws:
- java.lang.Exception - if no class attribute is assigned, or the class
attribute is not nominal
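The formula above can be sketched without any WEKA classes. This is only an illustration under stated assumptions: the class FrequencyDependentCostSketch and its method are hypothetical, the Laplace estimate is taken to be (count + 1) / (total + number of classes), the diagonal is left at 0, and rows are indexed by the actual class as in the class description.

```java
import java.util.Arrays;

class FrequencyDependentCostSketch {

    /** counts[c] is the number of instances observed with class index c. */
    static double[][] frequencyDependentCosts(int[] counts, double weight) {
        int numClasses = counts.length;
        long total = 0;
        for (int count : counts) {
            total += count;
        }
        // Laplace estimate (assumed form): add one to every class count.
        double[] p = new double[numClasses];
        for (int c = 0; c < numClasses; c++) {
            p[c] = (counts[c] + 1.0) / (total + numClasses);
        }
        // cost[i][j]: cost of classifying an instance of class i as class j.
        double[][] cost = new double[numClasses][numClasses];
        for (int i = 0; i < numClasses; i++) {
            for (int j = 0; j < numClasses; j++) {
                if (i != j) {
                    cost[i][j] = weight * p[j] / p[i];
                }
            }
        }
        return cost;
    }

    public static void main(String[] args) {
        // Class 0 is rare (1 instance), class 1 is frequent (7 instances).
        double[][] cost = frequencyDependentCosts(new int[] {1, 7}, 1000.0);
        System.out.println(Arrays.deepToString(cost));
    }
}
```

With these counts the Laplace estimates are 0.2 and 0.8, so misclassifying the rare class as the frequent one costs about 4 * weight, and the reverse error costs about 0.25 * weight.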
readOldFormat
public void readOldFormat(java.io.Reader reader) throws java.lang.Exception
Reads misclassification cost matrix from given reader.
Each line has to contain three numbers: the index of the true
class, the index of the incorrectly assigned class, and the
weight, separated by white space characters. Comments can be
appended to the end of a line by using the '%' character.
- Parameters:
reader
- the reader from which the cost matrix is to be read
- Throws:
- java.lang.Exception - if the cost matrix does not have the
right format
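A reader for the line format described above might look like the following sketch. It is not the WEKA implementation: the class OldFormatCostReaderSketch is hypothetical, and it assumes zero-based class indices, a known number of classes, default costs (0 on the diagonal, 1 elsewhere) for entries the file does not mention, and that blank lines are skipped.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;

class OldFormatCostReaderSketch {

    static double[][] read(Reader reader, int numClasses) throws IOException {
        // Assumed starting point: 0 on the diagonal, 1 for any misclassification.
        double[][] cost = new double[numClasses][numClasses];
        for (int i = 0; i < numClasses; i++) {
            for (int j = 0; j < numClasses; j++) {
                cost[i][j] = (i == j) ? 0.0 : 1.0;
            }
        }
        BufferedReader in = new BufferedReader(reader);
        String line;
        while ((line = in.readLine()) != null) {
            // Everything from '%' to the end of the line is a comment.
            int comment = line.indexOf('%');
            if (comment >= 0) {
                line = line.substring(0, comment);
            }
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\\s+");
            if (parts.length != 3) {
                throw new IOException("Expected three numbers but found: " + line);
            }
            int actual = Integer.parseInt(parts[0]);
            int predicted = Integer.parseInt(parts[1]);
            double weight = Double.parseDouble(parts[2]);
            if (actual < 0 || actual >= numClasses
                    || predicted < 0 || predicted >= numClasses) {
                throw new IOException("Class index out of range: " + line);
            }
            cost[actual][predicted] = weight;
        }
        return cost;
    }

    public static void main(String[] args) throws IOException {
        String text = "0 1 5.0 % class 0 predicted as class 1\n"
                    + "1 0 2.5\n";
        System.out.println(Arrays.deepToString(read(new StringReader(text), 2)));
    }
}
```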
initialize
public void initialize()
Sets the costs to default values (i.e. 0 down the diagonal, and 1 for
any misclassification).
size
public int size()
Gets the number of classes.
- Returns:
- the number of classes
normalize
public void normalize()
Normalizes the cost matrix so that diagonal elements are zero. The value
of non-zero diagonal elements is subtracted from the row containing the
value. For example:
2 5
3 -1
becomes
0 3
4 0
This normalization will affect total classification cost during
evaluation, but will not affect the decision made by applying minimum
expected cost criteria during prediction.
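The row-wise subtraction can be reproduced on a plain array. The sketch below is illustrative only; NormalizeSketch is a hypothetical class that operates on a double[][] rather than on a CostMatrix.

```java
import java.util.Arrays;

class NormalizeSketch {

    /** Subtracts each row's diagonal element from every element of that row. */
    static void normalize(double[][] cost) {
        for (int row = 0; row < cost.length; row++) {
            // Capture the diagonal value first, because the loop overwrites it.
            double diagonal = cost[row][row];
            for (int col = 0; col < cost[row].length; col++) {
                cost[row][col] -= diagonal;
            }
        }
    }

    public static void main(String[] args) {
        double[][] cost = {{2, 5}, {3, -1}};
        normalize(cost);
        System.out.println(Arrays.deepToString(cost)); // prints [[0.0, 3.0], [4.0, 0.0]]
    }
}
```

Because each row corresponds to one actual class, subtracting a constant from a row shifts the expected cost of every possible prediction by the same amount, which is why the minimum expected cost decision is unchanged.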
applyCostMatrix
public Instances applyCostMatrix(Instances instances,
java.util.Random random) throws java.lang.Exception
Changes the dataset to reflect a given set of costs.
Sets the weights of instances according to the misclassification
cost matrix, or does resampling according to the cost matrix (if
a random number generator is provided). Returns a new dataset.
- Parameters:
instances
- the instances to apply cost weights to.
random
- a random number generator; if one is provided, the dataset is resampled
rather than reweighted
- Returns:
- the new dataset
- Throws:
- java.lang.Exception - if the cost matrix does not have the right
format
expectedCosts
public double[] expectedCosts(double probabilities[]) throws java.lang.Exception
Calculates the expected misclassification cost for each possible
class value, given class probability estimates.
- Parameters:
probabilities
- an array containing probability estimates for each
class value.
- Returns:
- an array containing the expected misclassification cost for each
class.
- Throws:
- java.lang.Exception - if the number of probabilities does not match the
number of classes.
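As a rough sketch of the calculation, the expected cost of predicting class i is the sum over actual classes j of probabilities[j] times the cost of classifying j as i. The class ExpectedCostSketch below is hypothetical and works on a plain double[][] indexed as cost[actual][predicted], following the row and column convention in the class description; it throws an unchecked exception where the documented method throws java.lang.Exception.

```java
import java.util.Arrays;

class ExpectedCostSketch {

    static double[] expectedCosts(double[][] cost, double[] probabilities) {
        if (probabilities.length != cost.length) {
            throw new IllegalArgumentException(
                "Number of probabilities does not match number of classes");
        }
        double[] expected = new double[cost.length];
        for (int predicted = 0; predicted < cost.length; predicted++) {
            for (int actual = 0; actual < cost.length; actual++) {
                expected[predicted] += probabilities[actual] * cost[actual][predicted];
            }
        }
        return expected;
    }

    /** Index of the smallest expected cost (the minimum expected cost decision). */
    static int minCostClass(double[] expected) {
        int best = 0;
        for (int i = 1; i < expected.length; i++) {
            if (expected[i] < expected[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] cost = {{0, 3}, {4, 0}};
        double[] expected = expectedCosts(cost, new double[] {0.75, 0.25});
        System.out.println(Arrays.toString(expected)); // prints [1.0, 2.25]
        System.out.println(minCostClass(expected));    // prints 0
    }
}
```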
getMaxCost
public double getMaxCost(int actualClass)
Gets the maximum misclassification cost possible for a given actual
class value.
- Parameters:
actualClass
- the index of the actual class value
- Returns:
- the highest cost possible for misclassifying this class
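One plausible reading of this method is a maximum over the row of the actual class. The sketch below assumes that reading, including the diagonal element in the scan; MaxCostSketch is a hypothetical class on a plain double[][] indexed as cost[actual][predicted].

```java
class MaxCostSketch {

    /** Largest entry in the row belonging to the given actual class. */
    static double maxCost(double[][] cost, int actualClass) {
        double max = cost[actualClass][0];
        for (int predicted = 1; predicted < cost[actualClass].length; predicted++) {
            max = Math.max(max, cost[actualClass][predicted]);
        }
        return max;
    }

    public static void main(String[] args) {
        double[][] cost = {{0, 3}, {4, 0}};
        System.out.println(maxCost(cost, 0)); // prints 3.0
        System.out.println(maxCost(cost, 1)); // prints 4.0
    }
}
```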
main
public static void main(java.lang.String args[])
Tests out creation of a frequency dependent cost matrix from the command
line. Either pipe a set of instances into System.in or give the name of
a dataset as an argument. The last column will be treated as the class
attribute and a cost matrix with weight 1000 output.
- Parameters:
args
- the command-line arguments; optionally the name of a dataset