The tree_.feature array for DecisionTreeClassifier and DecisionTreeRegressor contains the value -2 a few times at the end. Is this because those entries are leaf nodes? Can I assume that any -2 value marks a leaf node?
Generally, yes. The attribute tree_.feature holds -2 for any node that has no split, which occurs if and only if the node is a leaf (at least when the tree was grown without pruning, i.e. with ccp_alpha=0; note that ccp_alpha is set on the estimator's constructor, not passed to fit()).
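As a quick sanity check, here is a minimal sketch that compares the -2 entries against the nodes that have no children. The iris dataset, max_depth=3 and random_state=0 are arbitrary choices made for illustration, not something from the question.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Arbitrary example data and settings, chosen only for illustration
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
tree = clf.tree_

# A leaf has no children, so its entry in children_left is -1
is_leaf = tree.children_left == -1

# Nodes without a split carry -2 in the feature array
no_split = tree.feature == -2

# The two masks should select the same nodes
print(np.array_equal(is_leaf, no_split))
print(np.flatnonzero(no_split))  # node ids of the leaves
```

If the two masks agree, filtering on feature == -2 is a reasonable way to pick out the leaves of a fitted tree.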
While it's not 100% clear in the help() documentation, a reference to this can be found inside the code here:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_tree.pyx
where the feature value is set to the static variable TREE_UNDEFINED = -2.
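Rather than hard-coding -2, the constant can be imported and compared against directly. This is only a sketch: sklearn.tree._tree is a private module, so the import path is an assumption that may break between scikit-learn versions, and the dataset and max_depth=2 are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeRegressor
# Private module: these names are not part of the public API and may move
from sklearn.tree._tree import TREE_LEAF, TREE_UNDEFINED

# Arbitrary example data and settings, chosen only for illustration
X, y = load_iris(return_X_y=True)
reg = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)

# Walk the node arrays and label each node by comparing with the constant
for node_id, feat in enumerate(reg.tree_.feature):
    if feat == TREE_UNDEFINED:
        print(f"node {node_id}: leaf (no split feature)")
    else:
        print(f"node {node_id}: splits on feature {feat}")
```

The same module also exposes TREE_LEAF, the value stored in children_left and children_right for leaf nodes.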