I am only getting `accuracy_score` instead of `roc_auc` for XGBClassifier in both GridSearch and cross validation

I am using XGBClassifier for the Rain in Australia dataset and trying to predict whether it will rain today or not. I wanted to tune the hyperparameters of the classifier with GridSearch and score it with ROC_AUC. Here is my code:

param_grid = {
    "max_depth": [3, 4, 5, 7],
    "gamma": [0, 0.25, 1],
    "reg_lambda": [0, 1, 10],
    "scale_pos_weight": [1, 3, 5],
    "subsample": [0.8],  # Fix subsample
    "colsample_bytree": [0.5],  # Fix colsample_bytree
}

import xgboost as xgb
from sklearn.model_selection import GridSearchCV

# Init the classifier
xgb_cl = xgb.XGBClassifier(objective="binary:logistic", verbose=0)

# Init the estimator
grid_cv = GridSearchCV(xgb_cl, param_grid, scoring="roc_auc", n_jobs=-1)

# Fit
_ = grid_cv.fit(X, y)

When the search is finally done, I am getting the best score with .best_score_ but somehow only getting an accuracy score instead of ROC_AUC. I thought this was only the case with GridSearch, so I tried HalvingGridSearchCV and cross_val_score with scoring set to roc_auc but I got accuracy score for them too. I checked this by manually computing ROC_AUC with sklearn.metrics.roc_auc_score.

Is there anything I am doing wrong or what is the reason for this behavior?


  • Have you tried your own roc_auc scoring rule? It seems like you are passing labels instead of the probabilities that roc_auc actually needs.

    The problem is described here: Different result roc_auc_score and plot_roc_curve

    Solutions for custom scorers: Grid-Search finding Parameters for AUC


    Sorry, saw today that my introduction text from the notebook was missing lol

    When calculating roc_auc_score (with or without grid search, with or without a pipeline, it doesn't matter), you can pass it either labels (0/1) or probabilities (0.995, 0.6655). Labels are easy to get: just convert your probabilities to labels. However, that results in a straight reversed-L plot, which sometimes looks ugly. The other option is to pass the predicted probabilities to roc_auc_score. That results in a staircase reversed-L plot, which looks much better.

    So what you should test first is whether you can get a ROC AUC score with labels, with and without the grid. If that works, you should then try to get probabilities. And there, I believe, you may have to write your own scoring method, as the roc_auc scoring in the grid may only serve labels, which would result in high roc_auc scores. I wrote something for you, so you can see the label approach:

    import xgboost as xgb
    from sklearn.datasets import load_breast_cancer
    from sklearn.metrics import confusion_matrix, roc_auc_score
    from sklearn.model_selection import train_test_split

    # Load a small binary classification dataset
    cancer = load_breast_cancer()
    X = cancer.data
    y = cancer.target

    xgb_model = xgb.XGBClassifier(objective="binary:logistic",
                                  colsample_bytree=0.3,
                                  learning_rate=0.1,
                                  max_depth=5,
                                  gamma=10,
                                  n_estimators=10)

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
    xgb_model.fit(X_train, y_train)

    # predict() returns hard 0/1 labels, not probabilities
    preds = xgb_model.predict(X_test)
    print(confusion_matrix(preds, y_test))
    print('ROC AUC Score', roc_auc_score(y_test, preds))


    [[51  2]  
    [ 3 87]] 
    ROC AUC Score 0.9609862671660424

    Here you can see it is ridiculously high.

    If you want to do it with the grid, get rid of this:

    # Fit
    _ = grid_cv.fit(X, y)

    and just call grid_cv.fit(X, y). fit is a method applied to grid_cv, and the results are stored within grid_cv.

    print(grid_cv.best_score_) should deliver the AUC as you have already defined it. See also: different roc_auc with XGBoost gridsearch scoring='roc_auc' and roc_auc_score? But this should also be ridiculously high, as you will probably be serving labels instead of probabilities.

    Beware also of: What is the difference between cross_val_score with scoring='roc_auc' and roc_auc_score?

    And nobody stops you from applying the roc_auc_score function to your grid results...
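To see which numbers are actually being compared, here is a minimal sketch rather than the asker's exact setup. It assumes scikit-learn's GradientBoostingClassifier as a stand-in for XGBClassifier (so the snippet needs only scikit-learn), a tiny illustrative parameter grid, and a hypothetical custom scorer named proba_auc_scorer. It runs one grid search that records both the built-in "roc_auc" scoring and the custom probability-based scorer, then compares roc_auc_score on hard labels, roc_auc_score on probabilities, and balanced accuracy on a held-out split.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier  # assumption: stand-in for XGBClassifier
from sklearn.metrics import balanced_accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)


def proba_auc_scorer(estimator, X, y):
    """Hypothetical custom scorer: AUC computed from positive-class probabilities."""
    return roc_auc_score(y, estimator.predict_proba(X)[:, 1])


# One search, two metrics: the built-in "roc_auc" string and the custom callable.
grid = GridSearchCV(
    GradientBoostingClassifier(n_estimators=20, random_state=0),
    {"max_depth": [2, 3]},  # illustrative grid, not the one from the question
    scoring={"builtin": "roc_auc", "custom": proba_auc_scorer},
    refit="builtin",
    cv=3,
)
grid.fit(X_train, y_train)

builtin_cv = grid.cv_results_["mean_test_builtin"]
custom_cv = grid.cv_results_["mean_test_custom"]

# Manual checks on the held-out split with the refitted best estimator.
labels = grid.predict(X_test)              # hard 0/1 labels
probas = grid.predict_proba(X_test)[:, 1]  # positive-class probabilities

auc_from_labels = roc_auc_score(y_test, labels)
auc_from_probas = roc_auc_score(y_test, probas)
balanced_acc = balanced_accuracy_score(y_test, labels)

print("CV AUC, built-in scorer:", builtin_cv)
print("CV AUC, custom scorer:  ", custom_cv)
print("Held-out AUC from labels:       ", auc_from_labels)
print("Held-out AUC from probabilities:", auc_from_probas)
print("Held-out balanced accuracy:     ", balanced_acc)
```

If the two cross-validated columns agree, the built-in "roc_auc" scoring is already ranking by continuous scores rather than labels. For binary hard labels, roc_auc_score reduces to the mean of sensitivity and specificity, so the label-based value should match balanced accuracy. That may be why a manual check with predict() looks like an accuracy score.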