scikit-learn random-forest gridsearchcv

What will GridSearchCV choose if multiple estimators have the same score?


I'm using RandomForestClassifier in sklearn, with GridSearchCV to find the best estimator.

When several estimators (from simple to complex) get the same score in GridSearchCV, which one does GridSearchCV return? The simplest one? Or a random one?


Solution

  • GridSearchCV does not assess the model complexity (though that would be a neat feature). Neither does it choose among the best models randomly.

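    Instead, GridSearchCV simply performs an np.argmin() on the stored results (in current scikit-learn versions, on the rank_test_score array, where rank 1 is the best mean test score and tied candidates share the same rank). See the corresponding line in the source code.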

    Now, according to the NumPy docs,

    In case of multiple occurrences of the minimum values, the indices corresponding to the first occurrence are returned.

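    That is, GridSearchCV will always select the first among the best models: the tied candidate that comes first in the order in which the parameter grid is enumerated, i.e. the one with the lowest index in cv_results_.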
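
    The tie-breaking described above can be checked with a small experiment. This is only a sketch under stated assumptions: the toy data (two well-separated clusters, so every candidate scores 1.0 in every fold), the helper best_depth and the tie-breaker simplest_among_best are made up for illustration. Passing a callable as refit is a documented scikit-learn option, but note that best_score_ is not available when you use it.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy data (assumption): two far-apart clusters, so all candidates tie at 1.0.
X = np.array([[v] for v in list(range(10)) + list(range(100, 110))], dtype=float)
y = np.array([0] * 10 + [1] * 10)


def best_depth(depths, refit=True):
    """Hypothetical helper: grid-search max_depth, return (chosen depth, ranks)."""
    search = GridSearchCV(
        RandomForestClassifier(n_estimators=10, random_state=0),
        param_grid={"max_depth": depths},
        cv=5,
        refit=refit,
    )
    search.fit(X, y)
    return search.best_params_["max_depth"], search.cv_results_["rank_test_score"]


def simplest_among_best(cv_results):
    """Hypothetical tie-breaker: smallest max_depth among the rank-1 candidates."""
    tied = np.flatnonzero(cv_results["rank_test_score"] == 1)
    depths = np.array([cv_results["params"][i]["max_depth"] for i in tied])
    return int(tied[np.argmin(depths)])


# Same candidates, different order: the first listed one wins the tie.
depth_asc, ranks = best_depth([1, 2, 3])
depth_desc, _ = best_depth([3, 2, 1])

# With a custom refit callable, the shallowest tied model is picked instead.
depth_custom, _ = best_depth([3, 2, 1], refit=simplest_among_best)

print(ranks, depth_asc, depth_desc, depth_custom)
```

    So if you want the simplest model to win ties without writing a callable, list the simpler settings first in the grid.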