python scikit-learn gridsearchcv

Using Multiple Metric Evaluation with GridSearchCV


I am trying to use multiple metrics in GridSearchCV. My project needs several metrics, including accuracy and F1 score. However, after following the sklearn examples and online posts, I can't get mine to work. Here is my code:

from sklearn.model_selection import GridSearchCV
from sklearn.metrics import f1_score
from sklearn.neighbors import KNeighborsClassifier

clf = KNeighborsClassifier()

param_grid = {'n_neighbors': range(1,30), 'algorithm': ['auto','ball_tree','kd_tree', 'brute'], 'weights': ['uniform', 'distance'],'p': range(1,5)}

# Metrics for evaluation:
met_grid = ['accuracy', 'f1']  # The metric codes from sklearn

custom_knn = GridSearchCV(clf, param_grid, scoring=met_grid, refit='accuracy', return_train_score=True)

custom_knn.fit(X_train, y_train)
y_pred = custom_knn.predict(X_test)

The error occurs on the custom_knn.fit(X_train, y_train) line. Furthermore, if I comment out scoring=met_grid, refit='accuracy', return_train_score=True, it works. Here is the error:

ValueError: Target is multiclass but average='binary'. Please choose another average setting.

Also, if you could explain multiple metric evaluation or refer me to someone who can, that would be much appreciated!
Thanks


Solution

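  • The 'f1' scorer computes F1 with average='binary', so it only works for binary targets. For multiclass classification you have to use an averaged F1 variant instead, such as 'f1_macro', 'f1_micro', or 'f1_weighted', which differ in how the per-class scores are aggregated. The full list of available scoring names is in the scikit-learn documentation for the scoring parameter.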

    Try this!

    scoring = ['accuracy','f1_macro']
    
    custom_knn = GridSearchCV(clf, param_grid, scoring=scoring,
                              refit='accuracy', return_train_score=True, cv=3)
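
On the request to explain multiple metric evaluation: with a list of scorer names, GridSearchCV scores every parameter combination with each metric and stores the results in cv_results_ under keys such as mean_test_accuracy and mean_test_f1_macro. The refit argument names the one metric used to pick best_params_ and to refit the final model. Here is a minimal sketch of that, assuming the iris dataset as a stand-in for your multiclass data and a deliberately small grid so it runs quickly (both are my assumptions, not part of the question):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: iris (3 classes) stands in for the asker's multiclass data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Deliberately small grid (6 combinations) so the sketch runs quickly.
param_grid = {"n_neighbors": [3, 5, 7], "weights": ["uniform", "distance"]}
scoring = ["accuracy", "f1_macro"]

search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid,
    scoring=scoring,
    refit="accuracy",  # metric that selects best_params_ and the refitted model
    cv=3,
)
search.fit(X_train, y_train)

# Each metric gets its own columns in cv_results_.
results = search.cv_results_
for params, acc, f1 in zip(
    results["params"],
    results["mean_test_accuracy"],
    results["mean_test_f1_macro"],
):
    print(params, f"accuracy={acc:.3f}", f"f1_macro={f1:.3f}")

print("best by accuracy:", search.best_params_)

# predict() uses the estimator refitted with the best-by-accuracy parameters.
y_pred = search.predict(X_test)
```

If you would rather select the model by F1, setting refit='f1_macro' should be all that changes; both metrics are still reported in cv_results_ either way.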