
The test accuracy score is higher than the best score in GridSearchCV


I'm using GridSearchCV to find the best hyperparameters for my SVM model, but I am a little confused about the scoring. This is my grid search code:

# Train SVM with GridSearchCV
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

pipe = Pipeline([
    ('scaler', StandardScaler()), 
    ('SVM', SVC(kernel='rbf', decision_function_shape='ovo'))
])

param_grid = {
                'SVM__C': [1, 10, 100, 1000],
                'SVM__gamma': [1, 0.1, 0.01, 0.001]
            }

clf = GridSearchCV(pipe, param_grid, scoring='accuracy', verbose = 3, cv=5)
clf.fit(X_train, y_train)

Output:

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('scaler', StandardScaler()),
                                       ('SVM',
                                        SVC(decision_function_shape='ovo'))]),
             param_grid={'SVM__C': [1, 10, 100, 1000],
                         'SVM__gamma': [1, 0.1, 0.01, 0.001]},
             scoring='accuracy', verbose=3)

Then I printed the best score and the test accuracy:

print('Best score: ', clf.best_score_)
print('Test accuracy: ', clf.score(X_test, y_test))

And it prints:

Best score:  0.5501906602583355
Test accuracy:  0.5809569840502659

Why are the two scores different? As far as I know, best_score_ is the maximum of mean_test_score in cv_results_, so why is the test accuracy higher than the best score? I am still confused about this.


Solution

  • TL;DR: The two scores do not refer to the same 'test' set. One is the mean score on the 'test' folds created by cross-validation; the other is the score on your separate test set.

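    This is because cross-validation (CV) is run only on the training data you pass in (here X_train and y_train). best_score_ is the mean accuracy over the 5 held-out folds of that training data for the best parameter combination. Each of those fold models is trained on only the remaining 4/5 of X_train.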

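    On the other hand, clf.score(X_test, y_test) gives you the accuracy on your separate test set, using the best model refit on all of X_train (GridSearchCV's default, refit=True). The two scores are estimates computed on different samples, so in general they will not be equal, and either one can come out higher. The test data is not part of your training data, or at least it should not be.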
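
To make the distinction concrete, here is a minimal sketch that reproduces both numbers. The question's data is not shown, so this assumes a synthetic dataset from make_classification and a smaller parameter grid. The variable names fold_scores, cv_score and test_score are illustrative only. It shows that best_score_ is the mean of the per-fold validation scores stored in cv_results_, while clf.score evaluates the refit best model on data the search never saw.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the asker's data (assumption: the real data is not shown).
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

pipe = Pipeline([("scaler", StandardScaler()), ("SVM", SVC(kernel="rbf"))])
param_grid = {"SVM__C": [1, 10], "SVM__gamma": [0.1, 0.01]}  # reduced grid for speed

clf = GridSearchCV(pipe, param_grid, scoring="accuracy", cv=5)
clf.fit(X_train, y_train)  # cross-validation only ever sees X_train

results = clf.cv_results_
best = clf.best_index_

# Validation accuracy of the winning parameter combination on each of the 5 folds.
fold_scores = np.array([results[f"split{i}_test_score"][best] for i in range(5)])

cv_score = clf.best_score_               # mean of fold_scores
test_score = clf.score(X_test, y_test)   # refit best model, scored on the held-out set

print("Per-fold scores:", fold_scores)
print("Best score (mean over folds):", cv_score)
print("Test accuracy (held-out set):", test_score)
```

The per-fold scores usually vary by a few percentage points on their own, which gives a sense of how much difference between best_score_ and the test accuracy is just sampling noise.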