Tags: algorithm, machine-learning, artificial-intelligence, random-forest, decision-tree

Possible Algorithms for Random Forest


I am doing research on Random Forests and I was searching for algorithms for Random Forests.

I have already looked up Algorithms for Decision Trees (like ID3, C4.5, CART).

But what are the different algorithms for Random Forest? I couldn't fully work it out from the literature.

Could you say bagging and ExtraTrees are examples?

Thanks in advance


Solution

  • Any tree ensemble (i.e., forest) that relies on some way of injecting randomness to grow diverse, uncorrelated trees can be called a random forest. All variants of random forests are based on the same principle: the more diverse we can make the individual trees, the lower the resulting generalization error will be.

    One such way of injecting randomness is called Bootstrap Aggregating (Bagging), which injects randomness into the dataset sent to each tree**. Another is the Random Subspace method, which randomly samples a subset of features at each tree node and looks for the best (feature, value) split among only those (instead of considering all features); here the randomness lies in the tree-building process. ExtraTrees is another example that introduces randomness in the tree-building phase, first by randomly selecting a cut-point for each feature and then choosing the best (feature, value) split among those candidates. An interesting variant even introduces label noise independently into each base tree's dataset; I think you get the point.

    However, for many people the term Random Forest means the most famous member of the random forest family: the variant detailed in Breiman's famous paper. It simply combines the Bagging and Random Subspace methods discussed above, and that's it! (A rough code sketch of how these pieces fit together is given after the footnote below.)

    **Dataset randomization techniques, like bagging or the label-noise one, can be used with any learning algorithm, not just decision trees. So bagging isn't exactly an example of a Random Forest; it's more like a component of one.
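
    For concreteness, here is a minimal sketch using scikit-learn (the dataset, hyperparameter values, and the k-NN example are illustrative assumptions, not anything from the question). It shows bagging alone, bagging plus the random subspace trick (roughly Breiman's recipe assembled by hand), the dedicated RandomForestClassifier and ExtraTreesClassifier estimators, and bagging applied to a non-tree base learner.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import (BaggingClassifier, ExtraTreesClassifier,
                                  RandomForestClassifier)
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    # Toy dataset, purely for illustration.
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    models = {
        # Bagging alone: randomness only in the bootstrap sample fed to each tree.
        # (The base learner is passed as the first positional argument, which
        # works across scikit-learn versions.)
        "bagged trees": BaggingClassifier(
            DecisionTreeClassifier(), n_estimators=100, random_state=0),

        # Bagging + random subspace at each node (max_features="sqrt"):
        # roughly a Breiman-style Random Forest assembled by hand.
        "bagging + random subspace": BaggingClassifier(
            DecisionTreeClassifier(max_features="sqrt"),
            n_estimators=100, random_state=0),

        # The same recipe, as the dedicated estimator.
        "RandomForestClassifier": RandomForestClassifier(
            n_estimators=100, random_state=0),

        # ExtraTrees: also randomizes the cut-points (and uses no bootstrap
        # sampling by default).
        "ExtraTreesClassifier": ExtraTreesClassifier(
            n_estimators=100, random_state=0),

        # Bagging is not tied to trees: any base estimator works, e.g. k-NN.
        "bagged k-NN": BaggingClassifier(
            KNeighborsClassifier(), n_estimators=100, random_state=0),
    }

    for name, model in models.items():
        score = cross_val_score(model, X, y, cv=5).mean()
        print(f"{name:28s} CV accuracy: {score:.3f}")
    ```

    Which of these you call a "Random Forest" is mostly a naming question; what actually distinguishes the algorithms is where the randomness is injected (rows, features, cut-points, labels, ...).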