I can't figure out why the scikit-learn function roc_auc_score returns 1 in the following case:
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 0, 0, 0, 0, 1]
y_scores = [0.18101096153259277, 0.15506085753440857,
            0.9940806031227112, 0.05024950951337814,
            0.7381414771080017, 0.8922111988067627,
            0.8253260850906372, 0.9967281818389893]
roc_auc_score(y_true, y_scores)
The three scores 0.7381414771080017, 0.8922111988067627, and 0.8253260850906372 near the end are high, yet their labels are 0, 0, 0. So how can the AUC be 1? What am I getting wrong here?
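The AUC of the ROC curve measures only how well your model rank-orders the data points with respect to the positive class. It does not compare the scores against a fixed cutoff such as 0.5, so a negative example with a high score costs nothing as long as every positive example scores higher still.
In your example, both positive examples score higher than all of the negative ones. Hence, the roc_auc_score of 1 is correct. Sorting by score makes this visible: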
import pandas as pd

pd.DataFrame({'y_true': y_true, 'y_scores': y_scores}).sort_values('y_scores', ascending=False)
   y_scores  y_true
7  0.996728       1
2  0.994081       1
5  0.892211       0
6  0.825326       0
4  0.738141       0
0  0.181011       0
1  0.155061       0
3  0.050250       0
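To make the ranking interpretation concrete, here is a minimal sketch that computes the AUC by hand as the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half. The helper name pairwise_auc is made up for this illustration and is not part of scikit-learn. It also computes accuracy at a 0.5 cutoff, which is an assumption chosen here to show that thresholded accuracy and AUC answer different questions.

```python
from itertools import product


def pairwise_auc(y_true, y_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5.

    Hypothetical helper for illustration only, not a scikit-learn function.
    """
    pos = [s for y, s in zip(y_true, y_scores) if y == 1]
    neg = [s for y, s in zip(y_true, y_scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 0, 0, 0, 0, 1]
y_scores = [0.18101096153259277, 0.15506085753440857,
            0.9940806031227112, 0.05024950951337814,
            0.7381414771080017, 0.8922111988067627,
            0.8253260850906372, 0.9967281818389893]

# 2 positives x 6 negatives = 12 pairs, all ordered correctly
auc = pairwise_auc(y_true, y_scores)
print(auc)  # 1.0

# Accuracy at an assumed 0.5 cutoff: the three high-scoring negatives are wrong
preds = [int(s >= 0.5) for s in y_scores]
accuracy = sum(p == y for p, y in zip(preds, y_true)) / len(y_true)
print(accuracy)  # 0.625
```

So the three negatives you noticed do hurt accuracy at a 0.5 cutoff, but they sit below both positives in the ranking, which is all the AUC looks at. Any cutoff between roughly 0.893 and 0.994 would separate the two classes perfectly.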