Tags: c++, decision-tree, adaboost

Decision trees / stumps with AdaBoost


I just started learning about decision trees with AdaBoost, am trying it out in OpenCV, and have some questions.

Boosted Decision Trees

I understand that when I use AdaBoost with Decision Trees, I continuously fit Decision Trees to reweighted versions of the training data. Classification is done by a weighted majority vote of the trees.
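
Concretely, this is roughly what I am trying (a minimal sketch, assuming OpenCV 4's cv::ml::Boost; trainSamples and trainLabels are placeholder names filled with random toy data):

```cpp
#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>

int main() {
    // Placeholder toy data: 100 samples x 10 features (CV_32F)
    // and binary class labels 0/1 (CV_32S).
    cv::Mat trainSamples(100, 10, CV_32F);
    cv::Mat trainLabels(100, 1, CV_32S);
    cv::randu(trainSamples, 0.0, 1.0);
    cv::randu(trainLabels, 0, 2);

    cv::Ptr<cv::ml::Boost> boost = cv::ml::Boost::create();
    boost->setBoostType(cv::ml::Boost::REAL); // Real AdaBoost variant
    boost->setWeakCount(50);                  // number of weak trees
    boost->setMaxDepth(2);                    // shallow trees; 1 gives stumps

    cv::Ptr<cv::ml::TrainData> data =
        cv::ml::TrainData::create(trainSamples, cv::ml::ROW_SAMPLE, trainLabels);
    boost->train(data);

    // Prediction is the weighted vote over all weak trees.
    float prediction = boost->predict(trainSamples.row(0));
    (void)prediction;
    return 0;
}
```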

Can I instead use bootstrapping when training Decision Trees with AdaBoost? I.e., can we select subsets of our dataset and train a tree on each subset before feeding the classifiers into AdaBoost?

Boosted Decision Stumps

Do I use the same technique for Decision Stumps? Or can I instead create one stump per feature? I.e., if I have 2 classes with 10 features, do I create a total of 10 Decision Stumps, one for each feature, before feeding the classifiers into AdaBoost?


Solution

  • AdaBoost does not just train classifiers on different subsets of the data; it also adjusts the weights of the dataset elements depending on the performance the ensemble has reached so far. A detailed description may be found here.

    Yes, you can use the same technique to train decision stumps. The algorithm is approximately the following (a code sketch follows the list):

    1. Train the decision stump on the initial dataset with no weights (the same as each element having weight = 1).
    2. Update the weights of all elements using the AdaBoost formula w_i ← w_i · exp(−α · y_i · h(x_i)) (with labels y_i ∈ {−1, +1}), then normalize them. Weights of correctly classified elements become smaller; weights of misclassified elements become larger.
    3. Train the next decision stump using the current weights. That is, minimize not simply the number of mistakes made by this decision stump, but the sum of the weights of the misclassified elements.
    4. If the desired quality has not been achieved, go back to step 2.
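
    To make the loop concrete, here is a minimal, self-contained sketch of discrete AdaBoost over decision stumps in plain C++ (a sketch, not OpenCV's implementation; it assumes labels in {−1, +1} and finds each stump by brute force over all features and candidate thresholds):

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

struct Stump {
    int feature = 0;        // feature index to split on
    double threshold = 0.0; // split point
    int polarity = 1;       // +1: predict +1 when x[feature] >= threshold
    double alpha = 0.0;     // this stump's weight in the final vote

    int predict(const std::vector<double>& x) const {
        return x[feature] >= threshold ? polarity : -polarity;
    }
};

// Step 3: pick the stump minimizing the *weighted* error -- the sum of
// the weights of the misclassified elements, not the raw mistake count.
Stump trainStump(const std::vector<std::vector<double>>& X,
                 const std::vector<int>& y, const std::vector<double>& w) {
    Stump best;
    double bestErr = std::numeric_limits<double>::max();
    for (std::size_t f = 0; f < X[0].size(); ++f) {
        for (std::size_t t = 0; t < X.size(); ++t) {
            for (int pol : {+1, -1}) {
                Stump s;
                s.feature = static_cast<int>(f);
                s.threshold = X[t][f];
                s.polarity = pol;
                double err = 0.0;
                for (std::size_t i = 0; i < X.size(); ++i)
                    if (s.predict(X[i]) != y[i]) err += w[i];
                if (err < bestErr) { bestErr = err; best = s; }
            }
        }
    }
    return best;
}

// Steps 1-4: the boosting loop. Labels y[i] must be -1 or +1.
std::vector<Stump> adaboost(const std::vector<std::vector<double>>& X,
                            const std::vector<int>& y, int rounds) {
    const std::size_t n = X.size();
    std::vector<double> w(n, 1.0 / n); // step 1: uniform weights
    std::vector<Stump> ensemble;
    for (int r = 0; r < rounds; ++r) {
        Stump s = trainStump(X, y, w);
        double err = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            if (s.predict(X[i]) != y[i]) err += w[i];
        err = std::max(err, 1e-12); // guard against a perfect stump
        s.alpha = 0.5 * std::log((1.0 - err) / err);
        // Step 2: reweight and normalize. Correctly classified elements
        // get smaller weights, misclassified ones get larger weights.
        double sum = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            w[i] *= std::exp(-s.alpha * y[i] * s.predict(X[i]));
            sum += w[i];
        }
        for (double& wi : w) wi /= sum;
        ensemble.push_back(s);
    }
    return ensemble; // step 4: here we simply stop after a fixed round count
}

// Classification: weighted majority vote of all stumps.
int classify(const std::vector<Stump>& ensemble, const std::vector<double>& x) {
    double score = 0.0;
    for (const Stump& s : ensemble) score += s.alpha * s.predict(x);
    return score >= 0.0 ? +1 : -1;
}
```

    The brute-force stump search is quadratic in the number of samples per round; practical implementations sort each feature once and sweep candidate thresholds instead, but the weighted-error criterion is the same.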