Let's say we tune an SVM with GridSearchCV like this:
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

algorithm = SVC()
parameters = {'kernel': ['rbf', 'sigmoid'], 'C': [0.1, 1, 10]}
grid = GridSearchCV(algorithm, parameters)
grid.fit(X, y)
You then want to use the best-fit parameters/estimator in a cross_val_score. My question is: which model is grid at this point? Is it the best-performing one? In other words, can we just do
cross_val_scores = cross_val_score(grid, X=X, y=y)
or should we use
cross_val_scores = cross_val_score(grid.best_estimator_, X=X, y=y)
When I run both, they do not return the same scores, so I am curious which is the correct approach here. (I would assume best_estimator_.) That raises another question, though: if I pass just grid, what model does it actually use? The first one?
You don't need cross_val_score after fitting a GridSearchCV. It already has attributes that give you access to the cross-validation scores: cv_results_ holds all of them. You can index into it with the best_index_ attribute if you only want that specific estimator's results.
import pandas as pd

cv_results = pd.DataFrame(grid.cv_results_)
cv_results.iloc[grid.best_index_]
mean_fit_time 0.00046916
std_fit_time 1.3785e-05
mean_score_time 0.000251055
std_score_time 1.19038e-05
param_C 10
param_kernel rbf
params {'C': 10, 'kernel': 'rbf'}
split0_test_score 0.966667
split1_test_score 1
split2_test_score 0.966667
split3_test_score 0.966667
split4_test_score 1
mean_test_score 0.98
std_test_score 0.0163299
rank_test_score 1
Name: 5, dtype: object
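If you only need the headline numbers, the fitted search also exposes them directly; a minimal sketch, reusing the grid fitted above:
grid.best_params_   # {'C': 10, 'kernel': 'rbf'}
grid.best_score_    # mean_test_score of the best candidate, 0.98 here
# best_score_ is exactly the best row of mean_test_score:
grid.cv_results_['mean_test_score'][grid.best_index_]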
Most of the methods you call on a fitted GridSearchCV use the best model (grid.predict(...) gets you the predictions from the best model, for example). This is not true for cross_val_score: it clones and refits whatever estimator you pass it. Passing grid re-runs the entire grid search inside each fold, so the scoring is done against a freshly fitted search rather than against your already-fitted grid.best_estimator_. That is most likely where the difference you see comes from.
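To make the distinction concrete, here is a minimal sketch (assuming the X, y, and fitted grid from above):
import numpy as np
from sklearn.model_selection import cross_val_score

# Methods on the fitted search delegate to the best model:
assert np.array_equal(grid.predict(X), grid.best_estimator_.predict(X))

# Passing the search itself clones it, so the whole grid search is
# re-run inside each fold (nested cross-validation):
nested_scores = cross_val_score(grid, X=X, y=y)

# Passing best_estimator_ clones an SVC that already carries the chosen
# hyperparameters; only that single model is fit per fold:
tuned_scores = cross_val_score(grid.best_estimator_, X=X, y=y)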