Main Page | Modules | Data Structures | File List | Globals | Related Pages

DecisionTree.h File Reference


Detailed Description

A Decision Tree Structure.

This is the interface for creating, using, printing, & serializing Decision Trees. A decision tree is a recursive structure. Each internal node partitions the data based on the values of an attribute. Each leaf contains a prediction for the distinguished target attribute. For a more detailed discussion see chapter 3 of Tom Mitchell's book on machine learning.

Note that all the DecisionTrees created with an ExampleSpec maintain a pointer to the it; you shouldn't free or modify the ExampleSpec until you are done with all the DecisionTrees referencing it.

Wish List:
A standard in memory decision tree induction algorithm. Maybe the best starting point would be the decisionstump learner.

This isn't the right place for this wish, but it would be nice to have a RuleSet structure similar to this DecisionTree structure

Go to the source code of this file.

Data Structures

struct  _DecisionTree_
 ADT for working with decision trees. More...


Typedefs

typedef _DecisionTree_ DecisionTree
 ADT for working with decision trees.

typedef _DecisionTree_DecisionTreePtr
 ADT for working with decision trees.


Functions

DecisionTreePtr DecisionTreeNew (ExampleSpecPtr spec)
 Creates a new decision tree node.

void DecisionTreeFree (DecisionTreePtr dt)
 Frees the memory associated with the decision tree and all of its children.

DecisionTreePtr DecisionTreeClone (DecisionTreePtr dt)
 Recursively creates a copy of dt and returns it.

int DecisionTreeIsLeaf (DecisionTreePtr dt)
 Returns 1 if dt is a leaf node and 0 otherwise.

int DecisionTreeIsTreeGrowing (DecisionTreePtr dt)
 Returns 1 if dt is a growing node has any children which are, and 0 otherwise.

int DecisionTreeIsNodeGrowing (DecisionTreePtr dt)
 Returns 1 if dt is a growing node and 0 otherwise.

int DecisionTreeGetClass (DecisionTreePtr dt)
 Returns the index of the class which dt predicts.

void DecisionTreeSetClass (DecisionTreePtr dt, int theClass)
 Sets dt's class prediction to theClass.

void DecisionTreeAddToClassDistribution (DecisionTreePtr dt, ExamplePtr e)
 Put the example in the class distribution at the node.

float DecisionTreeGetClassProb (DecisionTreePtr dt, int theClass)
 Returns the probability of the class.

void DecisionTreeSetClassProb (DecisionTreePtr dt, int theClass, float prob)
 Sets the probability of the class.

float DecisionTreeGetClassDistributionSampleCount (DecisionTreePtr dt)
 Returns the number of samples added to the node's distribution.

void DecisionTreeZeroClassDistribution (DecisionTreePtr dt)
 Sets the nodes distribution to zeros.

void DecisionTreeSetTypeLeaf (DecisionTreePtr dt)
 Changes dt into a leaf node without changing its class prediction.

void DecisionTreeSetTypeGrowing (DecisionTreePtr dt)
 Changes dt into a growing node.

void DecisionTreeSplitOnDiscreteAttribute (DecisionTreePtr dt, int attNum)
 Changes dt into a discrete split.

void DecisionTreeSplitOnContinuousAttribute (DecisionTreePtr dt, int attNum, float threshold)
 Changes dt into a continuous split.

int DecisionTreeGetChildCount (DecisionTreePtr dt)
 Returns a count of the direct decendants of dt.

DecisionTreePtr DecisionTreeGetChild (DecisionTreePtr dt, int index)
 Returns one of the direct decendants of dt.

DecisionTreePtr DecisionTreeOneStepClassify (DecisionTreePtr dt, ExamplePtr e)
 Does one step of classifing e with dt.

int DecisionTreeClassify (DecisionTreePtr dt, ExamplePtr e)
 Uses dt to classify e and returns the index of the predicted class.

void DecisionTreeGatherGrowingNodes (DecisionTreePtr dt, VoidAListPtr list)
 Searches dt and appends all its growing nodes to the passed list.

void DecisionTreeGatherLeaves (DecisionTreePtr dt, VoidAListPtr list)
 Searches dt and appends all its leaf nodes to the passed list.

int DecisionTreeCountNodes (DecisionTreePtr dt)
 Returns a count of the number of nodes (of any type) in dt.

int DecisionTreeGetMostCommonClass (DecisionTreePtr dt)
 Return the index of the class that is predicted most commonly by leaf nodes in dt.

void DecisionTreeSetGrowingData (DecisionTreePtr dt, void *data)
 Each decision tree node has a pointer reserved for your use.

void * DecisionTreeGetGrowingData (DecisionTreePtr dt)
 Each decision tree node has a pointer reserved for your use.

void DecisionTreePrint (DecisionTreePtr dt, FILE *out)
 Prints the decision tree to the passed file.

void DecisionTreePrintStats (DecisionTreePtr dt, FILE *out)
 Prints counts of leaves at each level of the tree.

DecisionTreePtr DecisionTreeReadC45 (FILE *in, ExampleSpecPtr spec)
 Attempts to read a decision tree from the passed file.

DecisionTreePtr DecisionTreeRead (FILE *in, ExampleSpecPtr spec)
 Attempts to read a decision tree from the passed file.

void DecisionTreeWrite (DecisionTreePtr dt, FILE *out)
 Writes the decision tree to the passed file.


Typedef Documentation

typedef struct _DecisionTree_ DecisionTree
 

ADT for working with decision trees.

See DecisionTree.h for more detail.

typedef struct _DecisionTree_ * DecisionTreePtr
 

ADT for working with decision trees.

See DecisionTree.h for more detail.


Function Documentation

void DecisionTreeAddToClassDistribution DecisionTreePtr  dt,
ExamplePtr  e
 

Put the example in the class distribution at the node.

Updates the class distribution at the node with the class of the example. This does not make recursive calls, so you should use DecisionTreeOneStepClassify and add it everywhere until you get to a leaf (if that is what you intend).

int DecisionTreeClassify DecisionTreePtr  dt,
ExamplePtr  e
 

Uses dt to classify e and returns the index of the predicted class.

DecisionTreePtr DecisionTreeClone DecisionTreePtr  dt  ) 
 

Recursively creates a copy of dt and returns it.

This function copies the user data pointers, but doesn't copy the data they point to.

int DecisionTreeCountNodes DecisionTreePtr  dt  ) 
 

Returns a count of the number of nodes (of any type) in dt.

void DecisionTreeFree DecisionTreePtr  dt  ) 
 

Frees the memory associated with the decision tree and all of its children.

This function doesn't do anything with user growing data you may have attached using DecisionTreeSetGrowingData; you must deal with that before calling this function.

void DecisionTreeGatherGrowingNodes DecisionTreePtr  dt,
VoidAListPtr  list
 

Searches dt and appends all its growing nodes to the passed list.

void DecisionTreeGatherLeaves DecisionTreePtr  dt,
VoidAListPtr  list
 

Searches dt and appends all its leaf nodes to the passed list.

DecisionTreePtr DecisionTreeGetChild DecisionTreePtr  dt,
int  index
 

Returns one of the direct decendants of dt.

Index should be between 0 and DecisionTreeGetChildCount(dt) - 1. For nodes that split on continuous attributes use index 0 for the left child (<) and index 1 for the right child (>=).

int DecisionTreeGetChildCount DecisionTreePtr  dt  ) 
 

Returns a count of the direct decendants of dt.

That is, return a count of all the nodes that you can reach from dt by taking one step towards the leaves.

int DecisionTreeGetClass DecisionTreePtr  dt  ) 
 

Returns the index of the class which dt predicts.

This makes the most sense if dt is a Leaf node, but may be useful at other times as well.

float DecisionTreeGetClassDistributionSampleCount DecisionTreePtr  dt  ) 
 

Returns the number of samples added to the node's distribution.

float DecisionTreeGetClassProb DecisionTreePtr  dt,
int  theClass
 

Returns the probability of the class.

Returns what portion of the examples that were added to the class distribution at this node have the associated class.

void* DecisionTreeGetGrowingData DecisionTreePtr  dt  ) 
 

Each decision tree node has a pointer reserved for your use.

Use the GetGrowingData function to access the value of the pointer.

int DecisionTreeGetMostCommonClass DecisionTreePtr  dt  ) 
 

Return the index of the class that is predicted most commonly by leaf nodes in dt.

int DecisionTreeIsLeaf DecisionTreePtr  dt  ) 
 

Returns 1 if dt is a leaf node and 0 otherwise.

int DecisionTreeIsNodeGrowing DecisionTreePtr  dt  ) 
 

Returns 1 if dt is a growing node and 0 otherwise.

int DecisionTreeIsTreeGrowing DecisionTreePtr  dt  ) 
 

Returns 1 if dt is a growing node has any children which are, and 0 otherwise.

DecisionTreePtr DecisionTreeNew ExampleSpecPtr  spec  ) 
 

Creates a new decision tree node.

You should use the accessor methods to initialize it and attach it to an existing DecisionTree as needed.

DecisionTreePtr DecisionTreeOneStepClassify DecisionTreePtr  dt,
ExamplePtr  e
 

Does one step of classifing e with dt.

Returns the direct decendant of dt corresponding to the correct value of dt's test attribute. If dt is a leaf or growing node this function will return dt.

void DecisionTreePrint DecisionTreePtr  dt,
FILE *  out
 

Prints the decision tree to the passed file.

FILE * should be opened for writing. The decision tree will be written so as to be understandable by humans. Your mileage may vary.

Note that you could pass STDOUT to the function to write a decision tree to the console.

void DecisionTreePrintStats DecisionTreePtr  dt,
FILE *  out
 

Prints counts of leaves at each level of the tree.

The passed FILE * should be opened for writing. Note that you could pass STDOUT to the function to write the stats to the console.

DecisionTreePtr DecisionTreeRead FILE *  in,
ExampleSpecPtr  spec
 

Attempts to read a decision tree from the passed file.

FILE * should be opened for reading. Attaches the ExampleSpec to the read decision tree.

This function allocates memory which should be freed by calling DecisionTreeFree.

DecisionTreePtr DecisionTreeReadC45 FILE *  in,
ExampleSpecPtr  spec
 

Attempts to read a decision tree from the passed file.

FILE * should be opened for reading. The file, in, should contain a decision tree written in C4.5's binary format, not the pretty-printed text format. A run of C4.5 with its default arguments will produce 2 such files, stem.tree and stem.unpruned.

This function handles leaves, continuous splits, and discrete splits and will not be able to read trees built with C4.5's subsetting options.

void DecisionTreeSetClass DecisionTreePtr  dt,
int  theClass
 

Sets dt's class prediction to theClass.

Does not change dt's type to leaf node. This might be useful for anytime algorithms where a growing node needs to contain a reasonable prediction at all times.

void DecisionTreeSetClassProb DecisionTreePtr  dt,
int  theClass,
float  prob
 

Sets the probability of the class.

Changes the probability of the class without changing the sample count (unless the sample count was zero in which case it is set to 1).

void DecisionTreeSetGrowingData DecisionTreePtr  dt,
void *  data
 

Each decision tree node has a pointer reserved for your use.

Use the SetGrowingData function to change the value of the pointer. You can set the pointer to anything you like (for example, to store sufficient statistics on growing nodes), but remember that you are responsible to manage any memory that it points to.

void DecisionTreeSetTypeGrowing DecisionTreePtr  dt  ) 
 

Changes dt into a growing node.

void DecisionTreeSetTypeLeaf DecisionTreePtr  dt  ) 
 

Changes dt into a leaf node without changing its class prediction.

If dt is not a growing node this function also frees all of dt's children. Remember that you are responsible for anything stored in any of dt's children's growing pointers and you should clean up these pointers before calling this function.

void DecisionTreeSplitOnContinuousAttribute DecisionTreePtr  dt,
int  attNum,
float  threshold
 

Changes dt into a continuous split.

The new node splits on a threshold on a continuous attribute and adds children to dt for values of attNum < and >= the threshold. The created children start as growing nodes.

void DecisionTreeSplitOnDiscreteAttribute DecisionTreePtr  dt,
int  attNum
 

Changes dt into a discrete split.

The new node splits on the values of a discrete attribute and adds one child to dt for each value of attribute attNum. The created children start as growing nodes.

void DecisionTreeWrite DecisionTreePtr  dt,
FILE *  out
 

Writes the decision tree to the passed file.

FILE * should be opened for writing. The decision tree will be written in a binary format suitable to be read by DecisionTreeRead, but this function ignores any growing data that you've associated with dt -- if you need to save growing data you will need to serialize it some other way.

Note that you could pass STDOUT to the function to write an example to the console.

void DecisionTreeZeroClassDistribution DecisionTreePtr  dt  ) 
 

Sets the nodes distribution to zeros.


Generated for VFML by doxygen hosted by SourceForge.net Logo