I am using XGBClassifier on the Rain in Australia dataset, trying to predict whether it will rain today or not. I wanted to tune the hyperparameters of the classifier with GridSearchCV and score it with ROC AUC. Here is my code:
param_grid = {
    "max_depth": [3, 4, 5, 7],
    "gamma": [0, 0.25, 1],
    "reg_lambda": [0, 1, 10],
    "scale_pos_weight": [1, 3, 5],
    "subsample": [0.8],          # Fix subsample
    "colsample_bytree": [0.5],   # Fix colsample_bytree
}
import xgboost as xgb
from sklearn.model_selection import GridSearchCV

# Init the classifier
xgb_cl = xgb.XGBClassifier(objective="binary:logistic", verbosity=0)

# Init the grid search
grid_cv = GridSearchCV(xgb_cl, param_grid, scoring="roc_auc", n_jobs=-1)

# Fit
_ = grid_cv.fit(X, y)
When the search is finally done, I get the best score with .best_score_, but somehow it is an accuracy score instead of ROC AUC. I thought this was only the case with GridSearchCV, so I tried HalvingGridSearchCV and cross_val_score with scoring set to "roc_auc", but I got accuracy scores from them too. I checked this by manually computing ROC AUC with sklearn.metrics.roc_auc_score.
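For reference, my manual check was roughly along these lines (simplified; the actual features come from the Rain in Australia data, and cross_val_predict with cv=5 is just illustrative here):

from sklearn.model_selection import cross_val_score, cross_val_predict
from sklearn.metrics import roc_auc_score

# what cross-validation reports with scoring="roc_auc"
cv_scores = cross_val_score(xgb_cl, X, y, scoring="roc_auc", cv=5)
print("cross_val_score (roc_auc):", cv_scores.mean())

# manual check: out-of-fold predicted probabilities passed to roc_auc_score
oof_proba = cross_val_predict(xgb_cl, X, y, cv=5, method="predict_proba")[:, 1]
print("manual roc_auc_score:", roc_auc_score(y, oof_proba))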
Is there anything I am doing wrong, or what is the reason for this behavior?
Have you tried writing your own roc_auc scoring rule? It seems like you are passing labels instead of the probabilities that roc_auc actually needs.

The problem is described here: Different result roc_auc_score and plot_roc_curve

Solutions for own scorers: Grid-Search finding Parameters for AUC
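A minimal sketch of such an own scorer, assuming sklearn's make_scorer with needs_proba=True so the scorer is fed predicted probabilities rather than hard labels, could look like this:

from sklearn.metrics import roc_auc_score, make_scorer
from sklearn.model_selection import GridSearchCV

# needs_proba=True: the scorer receives predict_proba output, not 0/1 labels
proba_auc = make_scorer(roc_auc_score, needs_proba=True)

grid_cv = GridSearchCV(xgb_cl, param_grid, scoring=proba_auc, n_jobs=-1)
grid_cv.fit(X, y)
print(grid_cv.best_score_)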
Update 2

Sorry, I only saw today that my introduction text from the notebook was missing, lol.
When calculating roc_auc_score you have the option (it does not matter whether it is with or without grid search, with or without a pipeline) of passing either hard labels (0/1) or probabilities (e.g. 0.995, 0.6655). The labels are easy to get: you just convert your probabilities to labels. However, that results in a ROC plot shaped like a straight, reversed L, which sometimes looks ugly. The other option is to pass the predicted probabilities to roc_auc_score, which results in a staircase-shaped (reversed L) plot and looks much better.

So what you should test first is whether you can get a ROC AUC score with labels, with and without the grid. If that is the case, you should then try to get probabilities. And there, I believe, you have to write your own scoring method, as the roc_auc scoring in the grid only serves labels, which results in high roc_auc scores. I wrote something for you, so you can see the label approach:
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, roc_auc_score

cancer = load_breast_cancer()
X = cancer.data
y = cancer.target

xgb_model = xgb.XGBClassifier(objective="binary:logistic",
                              eval_metric="auc",
                              use_label_encoder=False,
                              colsample_bytree=0.3,
                              learning_rate=0.1,
                              max_depth=5,
                              gamma=10,
                              n_estimators=10,
                              verbosity=None)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

xgb_model.fit(X_train, y_train)

# hard 0/1 labels, not probabilities
preds = xgb_model.predict(X_test)

print(confusion_matrix(y_test, preds))
print('ROC AUC Score', roc_auc_score(y_test, preds))
Gives:
[[51  3]
 [ 2 87]]
ROC AUC Score 0.9609862671660424
Here you can see it is ridiculously high.
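For comparison, here is a sketch of the probability approach for the same model (continuing the snippet above; predict_proba(...)[:, 1] is the predicted probability of the positive class):

# predicted probabilities for the positive class instead of hard labels
proba = xgb_model.predict_proba(X_test)[:, 1]
print('ROC AUC Score (probas)', roc_auc_score(y_test, proba))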
If you want to do it with the grid, get rid of this:

# Fit
_ = grid_cv.fit(X, y)

and just call grid_cv.fit(X, y). fit is a method applied to grid_cv, and the results are stored within grid_cv, so

print(grid_cv.best_score_)

should deliver the AUC as you have already defined it.
See also: different roc_auc with XGBoost gridsearch scoring='roc_auc' and roc_auc_score?
But this will probably also be ridiculously high, as you are most likely serving labels instead of probabilities.
Beware also of: What is the difference between cross_val_score with scoring='roc_auc' and roc_auc_score?
And nothing stops you from applying the roc_auc_score function to your grid results yourself...
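For example, something along these lines (a sketch, assuming a held-out X_test/y_test split that the grid search was not fitted on):

from sklearn.metrics import roc_auc_score

# the refitted best estimator is available after grid_cv.fit(...)
best_model = grid_cv.best_estimator_

# score with the positive-class probabilities on held-out data
test_proba = best_model.predict_proba(X_test)[:, 1]
print("held-out ROC AUC:", roc_auc_score(y_test, test_proba))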