Tags: keras, scikit-learn, multilabel-classification, precision-recall

Is this the correct use of sklearn's classification_report for multi-label classification?


I am training a neural network with tf.keras. It is a multi-label classification problem where each sample belongs to multiple classes, e.g. [1, 0, 1, 0, ...]. The final model lines (just for clarity) are:

model.add(tf.keras.layers.Dense(9, activation='sigmoid'))  # final layer

model.compile(loss='binary_crossentropy', optimizer=optimizer,
              metrics=[tf.keras.metrics.BinaryAccuracy(),
                       tfa.metrics.F1Score(num_classes=9, average='macro', threshold=0.5)])

I need to generate precision, recall, and F1 scores for these classes (I already get the F1 score reported during training). For this I am using sklearn's classification_report, but I need to confirm that I am using it correctly in the multi-label setting.

from sklearn.metrics import classification_report
import numpy as np

pred = model.predict(x_test)
pred_one_hot = np.around(pred)  # threshold the sigmoid outputs at 0.5 to get binary indicator predictions

print(classification_report(one_hot_ground_truth, pred_one_hot))

This works fine and I get the full report for every class, including F1 scores that match the F1Score metric from TensorFlow Addons (for macro F1). Sorry this post is verbose, but what I am unsure about is:

Is it correct that the predictions need to be one-hot encoded (i.e. binarized) in the multi-label setting? If I pass in the raw prediction scores (sigmoid probabilities), an error is thrown.
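For reference, the np.around call above is equivalent to applying an explicit 0.5 threshold to the sigmoid outputs (up to how exact 0.5 values are rounded); a minimal sketch with made-up toy probabilities:

```python
import numpy as np

# toy per-class sigmoid probabilities for 2 samples and 3 classes
pred = np.array([[0.9, 0.2, 0.7],
                 [0.4, 0.8, 0.1]])

# explicit 0.5 threshold; same result as np.around(pred) for these values
pred_binary = (pred >= 0.5).astype(int)
print(pred_binary)  # rows: [1 0 1] and [0 1 0]
```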

thank you.


Solution

  • It is correct to use classification_report for binary, multi-class, and multi-label classification.

    The labels are not one-hot-encoded in the case of multi-class classification; they simply need to be either indices or labels.

    You can see that the two code snippets below yield the same output:

    Example with indices

    from sklearn.metrics import classification_report
    import numpy as np
    
    labels = np.array(['A', 'B', 'C'])

    y_true = np.array([1, 2, 0, 1, 2, 0])
    y_pred = np.array([1, 2, 1, 1, 1, 0])
    print(classification_report(y_true, y_pred, target_names=labels))
    

    Example with labels

    from sklearn.metrics import classification_report
    import numpy as np
    
    labels = np.array(['A', 'B', 'C'])
    
    y_true = labels[np.array([1, 2, 0, 1, 2, 0])]
    y_pred = labels[np.array([1, 2, 1, 1, 1, 0])]
    print(classification_report(y_true, y_pred))
    

    Both return

                  precision    recall  f1-score   support
    
               A       1.00      0.50      0.67         2
               B       0.50      1.00      0.67         2
               C       1.00      0.50      0.67         2
    
        accuracy                           0.67         6
       macro avg       0.83      0.67      0.67         6
    weighted avg       0.83      0.67      0.67         6
    

    In the context of multi-label classification, classification_report can be used as in the example below:

    from sklearn.metrics import classification_report
    import numpy as np
    
    labels = ['A', 'B', 'C']
    
    y_true = np.array([[1, 0, 1],
                       [0, 1, 0],
                       [1, 1, 1]])
    y_pred = np.array([[1, 0, 0],
                       [0, 1, 1],
                       [1, 1, 1]])
    
    print(classification_report(y_true, y_pred, target_names=labels))
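If you need the macro-averaged precision, recall, and F1 as plain numbers (for example, to compare against the tfa.metrics.F1Score value reported during training), classification_report can also return a dict via its output_dict parameter. A sketch reusing the same toy arrays as above:

```python
from sklearn.metrics import classification_report
import numpy as np

labels = ['A', 'B', 'C']

y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 1]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [1, 1, 1]])

report = classification_report(y_true, y_pred,
                               target_names=labels,
                               output_dict=True)

macro = report['macro avg']
# each value is 5/6 ≈ 0.833 for these toy arrays
print(macro['precision'], macro['recall'], macro['f1-score'])
```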