Tags: python, scikit-learn, named-entity-recognition, precision-recall

Same test and prediction values give 0 precision, recall, and F1 score for NER


I was using sklearn_crfsuite to compute the F1, precision, and recall scores, but there is an anomaly. Just for testing purposes, I passed the same values as both the ground truth and the predictions.

from sklearn_crfsuite import scorers
from sklearn_crfsuite import metrics

cls = [i for i, _ in enumerate(CLASSES)]  # label indices derived from CLASSES
cls.append(7)  # plus the extra labels 7 and 8
cls.append(8)

print(metrics.flat_classification_report(
    test["y"], test["y"], labels=cls, digits=3
))
              precision    recall  f1-score   support

           0      1.000     1.000     1.000       551
           1      0.000     0.000     0.000         0
           2      0.000     0.000     0.000         0
           3      1.000     1.000     1.000      1196
           4      1.000     1.000     1.000      2593
           5      1.000     1.000     1.000     95200
           6      1.000     1.000     1.000      1165
           7      1.000     1.000     1.000      9636
           8      1.000     1.000     1.000    506363

   micro avg      1.000     1.000     1.000    616704
   macro avg      0.778     0.778     0.778    616704
weighted avg      1.000     1.000     1.000    616704

Why are labels 1 and 2 getting all-zero scores? They should be 1.0 like the rest of the labels. Can anyone explain the reason?

Need help. Thanks in advance!


Solution

  • It seems that you don't actually have classes 1 and 2 in your data, since the support of both classes is zero. However, because you included classes 1 and 2 in the list of labels passed to flat_classification_report(), they are still taken into account when the metrics are computed: with zero support, their precision, recall, and F1 default to 0, which is why the macro average drops to 0.778 (seven of the nine classes score 1.0, i.e. 7/9).

    from sklearn_crfsuite import metrics
    import numpy as np
    np.random.seed(0)
    
    cmin = 0
    cmax = 8
    
    labels = np.arange(1 + cmax)
    print(np.unique(labels))
    # [0 1 2 3 4 5 6 7 8]
    
    y = np.random.randint(cmin, 1 + cmax, 1000).reshape(-1, 1)
    print(np.unique(y))
    # [0 1 2 3 4 5 6 7 8]
    
    # classification report when "y" takes on all the specified labels
    print(metrics.flat_classification_report(y_true=y, y_pred=y, labels=labels, digits=3))
    #               precision    recall  f1-score   support
    #            0      1.000     1.000     1.000       117
    #            1      1.000     1.000     1.000       106
    #            2      1.000     1.000     1.000       106
    #            3      1.000     1.000     1.000       132
    #            4      1.000     1.000     1.000       110
    #            5      1.000     1.000     1.000       115
    #            6      1.000     1.000     1.000       104
    #            7      1.000     1.000     1.000       109
    #            8      1.000     1.000     1.000       101
    #     accuracy                          1.000      1000
    #    macro avg      1.000     1.000     1.000      1000
    # weighted avg      1.000     1.000     1.000      1000
    
    # classification report when "y" takes on all the specified labels apart from 1 and 2,
    # but 1 and 2 are still included among the possible labels
    y = y[np.logical_and(y != 1, y != 2)].reshape(-1, 1)
    print(np.unique(y))
    # [0 3 4 5 6 7 8]
    
    print(metrics.flat_classification_report(y_true=y, y_pred=y, labels=labels, digits=3))
    #               precision    recall  f1-score   support
    #            0      1.000     1.000     1.000       117
    #            1      0.000     0.000     0.000         0
    #            2      0.000     0.000     0.000         0
    #            3      1.000     1.000     1.000       132
    #            4      1.000     1.000     1.000       110
    #            5      1.000     1.000     1.000       115
    #            6      1.000     1.000     1.000       104
    #            7      1.000     1.000     1.000       109
    #            8      1.000     1.000     1.000       101
    #    micro avg      1.000     1.000     1.000       788
    #    macro avg      0.778     0.778     0.778       788
    # weighted avg      1.000     1.000     1.000       788
    
    # classification report when "y" takes on all the specified labels apart from 1 and 2,
    # and 1 and 2 are not included among the possible labels
    labels = labels[np.logical_and(labels != 1, labels != 2)]
    print(np.unique(labels))
    # [0 3 4 5 6 7 8]
    
    print(metrics.flat_classification_report(y_true=y, y_pred=y, labels=labels, digits=3))
    #               precision    recall  f1-score   support
    #            0      1.000     1.000     1.000       117
    #            3      1.000     1.000     1.000       132
    #            4      1.000     1.000     1.000       110
    #            5      1.000     1.000     1.000       115
    #            6      1.000     1.000     1.000       104
    #            7      1.000     1.000     1.000       109
    #            8      1.000     1.000     1.000       101
    #     accuracy                          1.000       788
    #    macro avg      1.000     1.000     1.000       788
    # weighted avg      1.000     1.000     1.000       788