
Micro F1 score in Scikit-Learn with Class imbalance


I have some class imbalance and a simple baseline classifier that assigns the majority class to every sample:

from sklearn.metrics import precision_score, recall_score, confusion_matrix

y_true = [0,0,0,1]
y_pred = [0,0,0,0]
confusion_matrix(y_true, y_pred)

This yields

[[3, 0],
 [1, 0]]

With class 0 treated as the positive class, this means TP=3, FP=1, FN=0.
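For reference, scikit-learn lays the confusion matrix out with true labels as rows and predicted labels as columns, so which cells count as TP/FP/FN depends on which class you treat as positive. A quick check (the labels argument just reorders the classes):

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
# tp=0, fp=0, fn=1, tn=3  (class 1 as the positive class)

tn0, fp0, fn0, tp0 = confusion_matrix(y_true, y_pred, labels=[1, 0]).ravel()
# tp0=3, fp0=1, fn0=0, tn0=0  (class 0 as the positive class)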

So far, so good. Now I want to calculate the micro average of precision and recall.

precision_score(y_true, y_pred, average='micro') # yields 0.75
recall_score(y_true, y_pred, average='micro') # yields 0.75
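For comparison, the per-class numbers (average=None returns one score per class; pos_label=0 treats class 0 as the positive class for binary averaging):

recall_score(y_true, y_pred, average=None)                   # array([1., 0.])
recall_score(y_true, y_pred, average='binary', pos_label=0)  # 1.0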

I am OK with the precision, but why is recall not 1.0? How can the two ever be the same in this example, given that FP > 0 and FN == 0? I know it must have to do with the micro averaging, but I can't wrap my head around it.


Solution

  • Yes, it's because of micro-averaging. See the documentation here for how it is calculated:

    Note that if all labels are included, “micro”-averaging in a multiclass setting will produce precision, recall and f-score that are all identical to accuracy.

    As you can see on the linked page, with "micro" averaging both precision and recall are computed globally over all (sample, label) pairs, as P(y, y-hat) and R(y, y-hat), where R(y, y-hat) is defined as

    R(y, y-hat) = |y ∩ y-hat| / |y|

    i.e. the number of correct predictions divided by the total number of true labels (precision is defined analogously, with |y-hat| in the denominator).

    So in your case, micro-averaged recall is calculated as

    R = number of correct predictions / total number of predictions = 3/4 = 0.75

    Micro-averaged precision has the same numerator and denominator here (3 correct out of 4 predictions in total), which is why the two scores coincide even though FP > 0 and FN == 0 for class 0.
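
    You can verify this with the same toy data: every micro-averaged score collapses to plain accuracy (a small verification sketch):

    from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

    y_true = [0, 0, 0, 1]
    y_pred = [0, 0, 0, 0]

    # With micro averaging, every (sample, label) pair counts once, so all of
    # these return the same value:
    accuracy_score(y_true, y_pred)                    # 0.75
    precision_score(y_true, y_pred, average='micro')  # 0.75
    recall_score(y_true, y_pred, average='micro')     # 0.75
    f1_score(y_true, y_pred, average='micro')         # 0.75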