Tags: machine-learning, artificial-intelligence, classification, decision-tree, termination

Stopping condition when building decision trees


I am writing my own code for a decision tree and need to decide when to terminate the tree-building process. I thought of limiting the height of the tree, but this seems trivial. Could anyone suggest a better way to define the stopping condition?


Solution

  • There is little context in your question, but I assume you are constructing a tree from a large set of data? In that case, one solution is to set aside a "StopSet" of examples in addition to the "LearnSet", and to evaluate the tree on this StopSet at regular intervals while it is being built. If quality on the StopSet decreases, this is an indication that you are overtraining (overfitting) on the LearnSet, and it is time to stop growing the tree.

    I deliberately use "StopSet" and not "TestSet" because, once the tree is finished, you should still apply it to a separate TestSet to assess its real quality.
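
The StopSet idea above can be sketched in a few lines of plain Python. This is only an illustration under some assumptions that are not in the answer: features are numeric, splits minimise Gini impurity, "regularly verify" is taken to mean "after each extra level of depth", and all names (`build`, `grow_with_stop_set`, the toy data) are made up for the example.

```python
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature, threshold) with the lowest weighted Gini, or None
    if no split is strictly better than leaving the node as a leaf."""
    best, best_score, n = None, gini(labels), len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build(rows, labels, max_depth):
    """Leaves are plain labels; inner nodes are (feature, threshold, left, right).
    Assumes labels themselves are not tuples."""
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        return majority(labels)
    f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build([r for r, _ in lo], [y for _, y in lo], max_depth - 1),
            build([r for r, _ in hi], [y for _, y in hi], max_depth - 1))

def predict(tree, row):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)

def grow_with_stop_set(learn_x, learn_y, stop_x, stop_y, depth_limit=20):
    """Deepen the tree one level at a time; stop as soon as accuracy on the
    StopSet drops, and return the best tree seen so far with its accuracy."""
    best_tree = build(learn_x, learn_y, 0)
    best_acc = accuracy(best_tree, stop_x, stop_y)
    for depth in range(1, depth_limit + 1):
        tree = build(learn_x, learn_y, depth)
        acc = accuracy(tree, stop_x, stop_y)
        if acc < best_acc:
            break  # quality on the StopSet decreased: overfitting the LearnSet
        if acc > best_acc:
            best_tree, best_acc = tree, acc
    return best_tree, best_acc

# Toy data (invented): the true rule is "class 1 if x >= 5";
# the LearnSet contains one mislabelled example at x = 2.
learn_x = [[x] for x in range(10)]
learn_y = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]
stop_x = [[0.5], [1.5], [2.5], [3.5], [5.5], [6.5], [7.5], [8.5]]
stop_y = [0, 0, 0, 0, 1, 1, 1, 1]
test_x = [[0.2], [1.8], [3.2], [5.2], [7.2], [9.2]]
test_y = [0, 0, 0, 1, 1, 1]

tree, stop_acc = grow_with_stop_set(learn_x, learn_y, stop_x, stop_y)
test_acc = accuracy(tree, test_x, test_y)  # final, unbiased quality estimate
print(tree, stop_acc, test_acc)
```

On this toy data the loop should keep the single split at x <= 4 and stop before the tree starts carving out the mislabelled point. Rebuilding the tree at every depth is wasteful; a real implementation would more likely grow it incrementally and check the StopSet before accepting each new split. The TestSet is touched only once, after the stopping decision has been made.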