machine-learning classification logistic-regression multiclass-classification precision-recall

When do micro- and macro-averages differ a lot?

I am learning Machine Learning theory. I have a confusion matrix of a prediction using a Logistic Regression with multiple classes.

Now I have calculated the micro and macro averages (precision & recall).

The values are quite different, and I wonder which factors influence this. Under which conditions do the micro and macro averages differ a lot?

What I noticed is that the accuracies of the predictions differ for the different classes. Is this the reason? Or what other factors can cause this?

The sample confusion matrix:

And my calculated micro-macro-averages:

precision-micro = ~0.7329
recall-micro = ~0.7329

precision-macro = ~0.5910
recall-macro = ~0.6795

Solution

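The difference between micro and macro averages becomes apparent on imbalanced datasets. Micro averaging weights every sample equally, so large classes dominate the result; macro averaging weights every class equally, so a small class counts as much as a large one. The two diverge when class sizes are uneven and the classifier's performance varies between classes.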

The micro average is a global strategy that basically ignores that there is a distinction between classes. It is calculated by counting the total true positives, false negatives and false positives over all classes.
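As a minimal sketch of that pooled counting, the snippet below computes micro-averaged precision and recall from a confusion matrix. The matrix `cm` is a made-up 3-class example (it is not the matrix from the question), and the function name `micro_precision_recall` is just illustrative. Rows are assumed to be true classes and columns predicted classes.

```python
# Hypothetical 3-class confusion matrix (NOT the asker's data).
# Assumed layout: rows = true class, columns = predicted class.
cm = [
    [80, 5, 15],
    [4, 6, 0],
    [10, 2, 8],
]


def micro_precision_recall(cm):
    """Pool TP, FP and FN over all classes, then compute one precision/recall."""
    n = len(cm)
    # True positives: the diagonal entries.
    tp = sum(cm[k][k] for k in range(n))
    # False positives for class k: predicted k (column k) but true class differs.
    fp = sum(cm[i][k] for k in range(n) for i in range(n) if i != k)
    # False negatives for class k: true class k (row k) but predicted differently.
    fn = sum(cm[k][j] for k in range(n) for j in range(n) if j != k)
    return tp / (tp + fp), tp / (tp + fn)


micro_p, micro_r = micro_precision_recall(cm)
print(f"micro precision = {micro_p:.4f}, micro recall = {micro_r:.4f}")
```

Both pooled sums run over the same off-diagonal cells, which is why the two numbers come out identical here.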

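In single-label classification (each sample belongs to exactly one class, i.e. not a multilabel problem), micro-averaged precision and recall both equal the accuracy score. Every misclassified sample counts once as a false positive for the predicted class and once as a false negative for the true class, so the total false positives equal the total false negatives. That is why your micro precision and recall are identical. Compute the accuracy score and compare; you will get the same value.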
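A small check of this equality, using made-up label lists (`y_true` and `y_pred` are hypothetical, not taken from the question) and plain Python so nothing beyond the standard library is assumed:

```python
# Hypothetical single-label predictions for three classes "a", "b", "c".
y_true = ["a", "a", "a", "a", "b", "b", "c", "c", "c", "c"]
y_pred = ["a", "a", "b", "a", "b", "c", "c", "c", "a", "c"]

# Accuracy: fraction of samples predicted correctly.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Pool TP / FP / FN over all classes (micro averaging).
tp = fp = fn = 0
for label in sorted(set(y_true) | set(y_pred)):
    for t, p in zip(y_true, y_pred):
        if p == label and t == label:
            tp += 1
        elif p == label and t != label:
            fp += 1  # predicted this class, but it was another one
        elif t == label and p != label:
            fn += 1  # was this class, but another one was predicted

micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)

print(accuracy, micro_precision, micro_recall)
```

Each of the three wrong predictions shows up once in `fp` and once in `fn`, so all three printed values agree.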

For the macro average, precision and recall are calculated for each label separately and reported as their unweighted mean. A class on which your classifier performs poorly pulls the macro average down by the same amount no matter how few samples it has, so the differing per-class performance you noticed is indeed part of the reason.
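To illustrate, here is a sketch that computes per-class precision and recall and their unweighted means for a made-up, imbalanced 3-class confusion matrix (hypothetical data, not the matrix from the question; rows are assumed to be true classes and columns predicted classes). Returning 0.0 for a class with no predictions or no samples is a convention chosen for this sketch.

```python
# Hypothetical imbalanced confusion matrix (NOT the asker's data).
# Assumed layout: rows = true class, columns = predicted class.
cm = [
    [80, 5, 15],   # large class, classified well
    [4, 6, 0],     # small class
    [10, 2, 8],    # small class, classified poorly
]


def per_class_precision_recall(cm):
    """Return lists of precision and recall, one entry per class."""
    n = len(cm)
    precisions, recalls = [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum = TP + FP
        actual_k = sum(cm[k])                          # row sum = TP + FN
        # Assumption for this sketch: score 0.0 when the denominator is zero.
        precisions.append(tp / predicted_k if predicted_k else 0.0)
        recalls.append(tp / actual_k if actual_k else 0.0)
    return precisions, recalls


precisions, recalls = per_class_precision_recall(cm)

# Macro average: plain unweighted mean over classes.
macro_precision = sum(precisions) / len(precisions)
macro_recall = sum(recalls) / len(recalls)

# Micro average for comparison (equals accuracy in the single-label case).
total = sum(sum(row) for row in cm)
micro = sum(cm[k][k] for k in range(len(cm))) / total

print(f"per-class precision: {[round(p, 3) for p in precisions]}")
print(f"per-class recall:    {[round(r, 3) for r in recalls]}")
print(f"macro precision = {macro_precision:.3f}, macro recall = {macro_recall:.3f}")
print(f"micro precision = micro recall = {micro:.3f}")
```

In this toy example the micro average is about 0.72, while the macro precision is about 0.55 and the macro recall is 0.60: the two small classes score worse, and the macro average gives them the same weight as the large one.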

You can also refer to this answer of mine, where it has been addressed in a bit more detail.