I have some class imbalance and a simple baseline classifier that assigns the majority class to every sample:
from sklearn.metrics import precision_score, recall_score, confusion_matrix
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 0, 0]
confusion_matrix(y_true, y_pred)
This yields
[[3, 0],
[1, 0]]
Taking class 0 (the majority class) as the positive class, this means TP=3, FP=1, FN=0.
So far, so good. Now I want to calculate the micro average of precision and recall.
precision_score(y_true, y_pred, average='micro') # yields 0.75
recall_score(y_true, y_pred, average='micro') # yields 0.75
I am OK with the precision, but why is recall not 1.0? How can they ever be the same in this example, given that FP > 0 and FN == 0? I know it must have to do with the micro averaging, but I can't wrap my head around this one.
Yes, it's because of micro-averaging. See the documentation here for how it's calculated:
Note that if all labels are included, “micro”-averaging in a multiclass setting will produce precision, recall and f-score that are all identical to accuracy.
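As a quick check of that note, here is a minimal sketch that reuses the `y_true` and `y_pred` from the question and compares the micro-averaged scores with plain accuracy. The variable names `acc`, `p_micro` and `r_micro` are only illustrative.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Same toy data as in the question: a majority-class baseline.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 0, 0]

acc = accuracy_score(y_true, y_pred)
p_micro = precision_score(y_true, y_pred, average='micro')
r_micro = recall_score(y_true, y_pred, average='micro')

# All three values should coincide for single-label data.
print(acc, p_micro, r_micro)
```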
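As you can see on the linked page, micro-averaged precision and recall are defined as P(y, y-hat) and R(y, y-hat), where y is the set of true (sample, label) pairs and y-hat is the set of predicted (sample, label) pairs,
where P(A, B) = |A ∩ B| / |B| and R(A, B) = |A ∩ B| / |A|.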
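So in your case, micro-averaged recall is calculated as
R = number of correct predictions / total number of samples = 3/4 = 0.75
Precision has the same numerator and, because every sample gets exactly one predicted label, the same denominator, so it is also 0.75.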
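To make the counting explicit, here is a rough sketch that derives per-class TP, FP and FN from the confusion matrix and sums them across classes, which is what micro-averaging amounts to. Each misclassified sample is a false positive for the predicted class and a false negative for the true class, so the summed FP and FN are always equal for single-label data. The last two lines show one way to get the per-class numbers the question expected, assuming class 0 is meant to be the positive class, via the `pos_label` parameter. Variable names are illustrative only.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = [0, 0, 0, 1]
y_pred = [0, 0, 0, 0]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

# Per-class counts, treating each class in turn as the positive one.
tp = np.diag(cm)
fp = cm.sum(axis=0) - tp  # predicted as this class, but truly another
fn = cm.sum(axis=1) - tp  # truly this class, but predicted as another

# Micro-averaging sums the counts over all classes before dividing.
micro_precision = tp.sum() / (tp.sum() + fp.sum())
micro_recall = tp.sum() / (tp.sum() + fn.sum())

# Binary scores with class 0 as the positive label (assumption: this is
# the view the question had in mind when it stated TP=3, FP=1, FN=0).
precision_class0 = precision_score(y_true, y_pred, pos_label=0)
recall_class0 = recall_score(y_true, y_pred, pos_label=0)

print(micro_precision, micro_recall, precision_class0, recall_class0)
```

With class 0 as the positive label, recall comes out as 1.0, while the micro average also counts the missed class-1 sample as a false negative.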