As in the title, I was wondering where I can check which decision tree algorithm is used by RandomForestClassifier in scikit-learn. The attributes list says base_estimator_ = DecisionTreeClassifier, and the algorithm behind DecisionTreeClassifier in scikit-learn is CART, so is that my answer?
link to scikit-learn RandomForest
Any suggestions would be appreciated.
Scikit-learn uses an optimized version of the CART algorithm (https://scikit-learn.org/stable/modules/tree.html#tree-algorithms-id3-c4-5-c5-0-and-cart).
It constructs trees by 'using the feature and threshold that yield the largest information gain'. The function used to measure the quality of a split (that is, the gain being maximized) can be set with the criterion parameter of RandomForestClassifier.
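One way to check this yourself is to fit a small forest and inspect its fitted trees. The sketch below is only illustrative: the iris dataset, n_estimators=10 and random_state=0 are arbitrary choices, not something the question specifies. It relies on the estimators_ attribute, which holds the fitted sub-estimators.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Illustrative data only; any classification dataset would do.
X, y = load_iris(return_X_y=True)

# Forest with the default split criterion.
forest_default = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Forest with the split criterion set explicitly.
forest_entropy = RandomForestClassifier(
    n_estimators=10, criterion="entropy", random_state=0
).fit(X, y)

# Each fitted tree in the ensemble is a DecisionTreeClassifier,
# and it inherits the criterion passed to the forest.
first_tree = forest_default.estimators_[0]
tree_class = type(first_tree).__name__
default_criterion = first_tree.criterion
entropy_criterion = forest_entropy.estimators_[0].criterion

print(tree_class, default_criterion, entropy_criterion)
```

The class name printed for the first tree confirms which tree estimator the forest builds, and the criterion values show that the setting on the forest is passed down to every tree.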
The default is gini (Gini impurity), but you can also select entropy. In practice these two are quite similar, but you can find more information here: https://datascience.stackexchange.com/questions/10228/when-should-i-use-gini-impurity-as-opposed-to-information-gain-entropy
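To get a feel for why the two criteria tend to behave alike, here is a minimal sketch that computes both impurity measures by hand for a two-class node. The helper names gini and entropy are my own and are not scikit-learn functions; the formulas are the standard textbook definitions, with entropy measured in bits.

```python
import math

def gini(proportions):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions."""
    return 1.0 - sum(p * p for p in proportions)

def entropy(proportions):
    """Shannon entropy in bits: sum(-p_k * log2(p_k)), skipping empty classes."""
    return sum(-p * math.log2(p) for p in proportions if p > 0)

# Compare both measures as a two-class node goes from nearly pure to balanced.
for p1 in (0.1, 0.3, 0.5):
    node = [p1, 1.0 - p1]
    print(f"p={p1:.1f}  gini={gini(node):.3f}  entropy={entropy(node):.3f}")
```

Both measures are zero for a pure node and reach their maximum when the classes are evenly mixed, so they usually rank candidate splits in a similar order.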