I am training a neural network with tf.keras. It is a multi-label classification problem where each sample can belong to multiple classes (e.g. [1, 0, 1, 0, ...]). The final model lines (just for clarity) are:
model.add(tf.keras.layers.Dense(9, activation='sigmoid'))  # final layer
model.compile(loss='binary_crossentropy', optimizer=optimizer,
              metrics=[tf.keras.metrics.BinaryAccuracy(),
                       tfa.metrics.F1Score(num_classes=9, average='macro', threshold=0.5)])
I need to generate precision, recall and F1 scores for these classes (I already get the F1 score reported during training). For this I am using sklearn's classification_report, but I need to confirm that I am using it correctly in the multi-label setting.
from sklearn.metrics import classification_report
pred = model.predict(x_test)
pred_one_hot = np.around(pred)  # rounds the sigmoid outputs to 0/1, giving a multi-hot indicator matrix
print(classification_report(one_hot_ground_truth, pred_one_hot))
This works fine and I get the full report for every class, including F1 scores that match the F1Score metric from TensorFlow Addons (for macro F1). Sorry this post is verbose, but what I am unsure about is:
Is it correct that the predictions need to be binarized (multi-hot encoded) in the multi-label setting? If I pass in the raw prediction scores (sigmoid probabilities), an error is thrown.
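For reference, this is what I mean by binarizing the sigmoid outputs (a minimal sketch with made-up probabilities):

```python
import numpy as np

# Hypothetical sigmoid outputs for 2 samples x 3 labels.
probs = np.array([[0.91, 0.12, 0.55],
                  [0.30, 0.78, 0.49]])

# Explicit 0.5 threshold; equivalent to np.around for these values,
# except exactly at 0.5 (np.around rounds half to even, so 0.5 -> 0).
multi_hot = (probs >= 0.5).astype(int)
print(multi_hot)  # [[1 0 1]
                  #  [0 1 0]]
```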
Thank you.
It is correct to use classification_report for binary, multi-class and multi-label classification.

The labels are not one-hot encoded in the case of multi-class classification. They simply need to be either indices or labels.
You can see that both snippets below yield the same output:
Example with indices
from sklearn.metrics import classification_report
import numpy as np
labels = np.array(['A', 'B', 'C'])
y_true = np.array([1, 2, 0, 1, 2, 0])
y_pred = np.array([1, 2, 1, 1, 1, 0])
print(classification_report(y_true, y_pred, target_names=labels))
Example with labels
from sklearn.metrics import classification_report
import numpy as np
labels = np.array(['A', 'B', 'C'])
y_true = labels[np.array([1, 2, 0, 1, 2, 0])]
y_pred = labels[np.array([1, 2, 1, 1, 1, 0])]
print(classification_report(y_true, y_pred))
Both return:
              precision    recall  f1-score   support

           A       1.00      0.50      0.67         2
           B       0.50      1.00      0.67         2
           C       1.00      0.50      0.67         2

    accuracy                           0.67         6
   macro avg       0.83      0.67      0.67         6
weighted avg       0.83      0.67      0.67         6
In the context of multi-label classification, classification_report
can be used as in the example below:
from sklearn.metrics import classification_report
import numpy as np
labels = ['A', 'B', 'C']
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 1]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [1, 1, 1]])
print(classification_report(y_true, y_pred, target_names=labels))
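If you only need the aggregate numbers (for instance, to compare against the macro F1 reported by tfa.metrics.F1Score during training), here is a quick sketch using sklearn's precision_recall_fscore_support on the same arrays:

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 1]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [1, 1, 1]])

# Macro averaging: compute precision/recall/F1 per label, then
# take the unweighted mean across the three labels.
p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred,
                                              average='macro',
                                              zero_division=0)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.83 0.83 0.83
```

These are the same numbers classification_report prints in its "macro avg" row.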