decision-tree rapidminer

RapidMiner Decision Tree results


Sorry, I am new to RapidMiner and am a bit confused about the results of my decision tree. I increased the minimal leaf size, which resulted in a smaller, more readable decision tree; however, I lost 2% in accuracy. I have been asked to explain why, but I'm at a loss as to what a better tree with lower accuracy says about my data.

Any help is much appreciated.

Neil.


Solution

  • Increasing the minimal leaf size prevents very specialized leaves that "match" only a few examples. This is usually done to prevent overfitting, which in its extreme form simply reproduces the training data with 100% accuracy. In exchange, you expect the model to be more robust on real, new data. So, depending on how you estimate the accuracy, some drop is expected when you force the algorithm to create a more general model, especially if the accuracy is measured on the training data itself.

    Also, when talking about a "more readable" tree, don't try to interpret the nodes individually, and don't assume the attribute in the root node is the most important one. What matters is always the combination of attributes along the path to a leaf.
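To make the trade-off concrete, here is a minimal sketch in plain Python (not RapidMiner, and not its actual algorithm). It grows a toy one-attribute tree on synthetic, noisy data and compares two minimal leaf sizes. The data, the noise level, and all function names (`build`, `predict`, `accuracy`, `count_leaves`, `make_data`) are assumptions made up for illustration.

```python
import random

def gini(pos, n):
    """Gini impurity of a node with `pos` positives out of `n` examples."""
    p = pos / n
    return 2 * p * (1 - p)

def build(points, min_leaf):
    """Grow a toy 1-D tree on (x, label) pairs; each leaf keeps >= min_leaf examples."""
    points = sorted(points)
    n = len(points)
    pos = sum(label for _, label in points)
    majority = 1 if 2 * pos >= n else 0
    if pos == 0 or pos == n or n < 2 * min_leaf:
        return majority  # pure node, or too small to split further
    best = None
    left_pos = 0
    for i in range(1, n):  # candidate split: first i points go left
        left_pos += points[i - 1][1]
        if i < min_leaf or n - i < min_leaf:
            continue  # a child would be smaller than the minimal leaf size
        if points[i - 1][0] == points[i][0]:
            continue  # cannot separate identical attribute values
        score = i * gini(left_pos, i) + (n - i) * gini(pos - left_pos, n - i)
        if best is None or score < best[0]:
            best = (score, i)
    if best is None:
        return majority
    i = best[1]
    threshold = points[i - 1][0]  # largest value that goes to the left child
    return (threshold, build(points[:i], min_leaf), build(points[i:], min_leaf))

def predict(tree, x):
    """Follow the splits down to a leaf and return its majority label."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree

def accuracy(tree, points):
    return sum(predict(tree, x) == label for x, label in points) / len(points)

def count_leaves(tree):
    if not isinstance(tree, tuple):
        return 1
    return count_leaves(tree[1]) + count_leaves(tree[2])

def make_data(n, rng, noise=0.15):
    """Synthetic data (assumption): label is 1 when x > 0.5, flipped with prob. `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = 1 if x > 0.5 else 0
        if rng.random() < noise:
            label = 1 - label
        data.append((x, label))
    return data

rng = random.Random(0)
train = make_data(200, rng)
test = make_data(200, rng)

for min_leaf in (1, 20):
    tree = build(train, min_leaf)
    print(f"min_leaf={min_leaf:2d}  leaves={count_leaves(tree):3d}  "
          f"train acc={accuracy(tree, train):.3f}  test acc={accuracy(tree, test):.3f}")
```

With a minimal leaf size of 1 the tree keeps splitting until it has memorized the training set, noise included, so its training accuracy is perfect by construction. The larger leaf size gives a much smaller tree that can no longer fit every noisy example. Whether the smaller tree does better or worse on the held-out set depends on the data, which is why the way you estimate accuracy matters when explaining a 2% drop.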