I'm working on NER and using sklearn.metrics.classification_report
to calculate the micro and macro F1 scores. It printed a table like:
              precision    recall  f1-score   support

           0     0.0000    0.0000    0.0000         0
           3     0.0000    0.0000    0.0000         0
           4     0.8788    0.9027    0.8906       257
           5     0.9748    0.9555    0.9650      1617
           6     0.9862    0.9888    0.9875      1156
           7     0.9339    0.9138    0.9237       835
           8     0.8542    0.7593    0.8039       216
           9     0.8945    0.8575    0.8756       702
          10     0.9428    0.9382    0.9405      1668
          11     0.9234    0.9139    0.9186      1661

    accuracy                         0.9285      8112
   macro avg     0.7388    0.7230    0.7305      8112
weighted avg     0.9419    0.9285    0.9350      8112
It's obvious that the predicted labels contain '0' and '3', but there is no '0' or '3' among the true labels. Why does the classification report show these two classes, which don't have any samples? And how can I prevent "0-support" classes from being shown? It seems that these two classes have a great impact on the macro F1 score.
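For reference, here is a small made-up reproduction (not my actual NER data) of the same symptom: classes that appear only in the predictions get a row with 0 support, and the macro average seems to include them.

from sklearn.metrics import classification_report, f1_score

# Toy data: classes 0 and 3 never occur in y_true,
# but the model predicts each of them once.
y_true = [4, 4, 5, 5, 5, 6]
y_pred = [4, 0, 5, 3, 5, 6]

# The report lists 0 and 3 with support 0 (zero_division=0 only silences
# the undefined-metric warnings; it's available in recent scikit-learn).
print(classification_report(y_true, y_pred, zero_division=0))

# The macro F1 averages over all of those labels, 0-support ones included,
# which pulls the score down.
print(f1_score(y_true, y_pred, average="macro", zero_division=0))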
You can use the following snippet to ensure that all labels shown in the classification report are actually present in y_true:
from sklearn.metrics import classification_report
import numpy as np

y_true = [0, 1, 2, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1, 42]

# restrict the report to the labels that actually occur in y_true
print(classification_report(y_true, y_pred, labels=np.unique(y_true)))
Which outputs:
              precision    recall  f1-score   support

           0       0.50      1.00      0.67         1
           1       0.00      0.00      0.00         1
           2       1.00      0.50      0.67         4

   micro avg       0.60      0.50      0.55         6
   macro avg       0.50      0.50      0.44         6
weighted avg       0.75      0.50      0.56         6
As you can see, the label 42, which is present in the predictions, is not shown, since it has no support in y_true.
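The same labels argument works if you compute the macro F1 directly with f1_score; a quick sketch with the toy data from above:

import numpy as np
from sklearn.metrics import f1_score

y_true = [0, 1, 2, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1, 42]

# Averaged over every label seen in y_true or y_pred (42 included): ~0.33
print(f1_score(y_true, y_pred, average="macro", zero_division=0))

# Restricted to labels that occur in y_true: ~0.44, matching the macro avg above
print(f1_score(y_true, y_pred, average="macro", labels=np.unique(y_true), zero_division=0))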