I am comparing the performance of two image-classification models, a ResNet-9 and a VGG-16. Their accuracies on the same test set are:
ResNet-9 = 99.25% accuracy
However, from their training loss curves (shown in the images), I see that VGG-16 has lower losses than ResNet-9.
I am using torch.nn.CrossEntropyLoss() for both VGG-16 and ResNet-9. I would have expected ResNet-9 to have the lower loss (since it performs better on the test set), but this is not the case.
Is this observation normal?
Yes, a model can have a higher loss even though it has higher accuracy. This is because cross-entropy penalizes correct predictions that are made with low confidence.
For example, suppose the true label is [0, 0, 0, 1]:
model A predicts [0.1, 0.1, 0.1, 0.7], and
model B predicts [0.1, 0.0, 0.0, 0.9].
Both models are 100% accurate on this sample (the highest probability falls on the correct class), but the cross-entropy loss of model A is greater than that of model B because model A assigns a lower probability (confidence) to the correct class.
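Here is a minimal sketch that reproduces this in PyTorch. Note that torch.nn.CrossEntropyLoss expects raw logits rather than probabilities, so for the already-softmaxed vectors above the equivalent computation is torch.nn.NLLLoss applied to their logs (the probability vectors below are the hypothetical ones from the example, not real model outputs):

```python
import torch

# Hypothetical post-softmax probabilities from the example above; the true class is index 3.
probs_a = torch.tensor([[0.1, 0.1, 0.1, 0.7]])  # model A: correct, but less confident
probs_b = torch.tensor([[0.1, 0.0, 0.0, 0.9]])  # model B: correct and more confident
target = torch.tensor([3])

# CrossEntropyLoss is NLLLoss applied to log-probabilities, so take log() first.
nll = torch.nn.NLLLoss()
loss_a = nll(probs_a.log(), target)  # -log(0.7) ≈ 0.357
loss_b = nll(probs_b.log(), target)  # -log(0.9) ≈ 0.105

print(loss_a.item(), loss_b.item())
# Both models pick class 3 (100% accuracy here), yet model A's loss is higher.
```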
So a higher loss for ResNet-9 compared to VGG-16, even though ResNet-9 has greater accuracy, means that ResNet-9's predictions are, on average, less confident in the labels it predicts.
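If you want to confirm this on your own models, you could compare the average softmax confidence each one assigns to its predicted class on the test set. A minimal sketch, assuming your trained models and a test DataLoader already exist (the names resnet9, vgg16 and test_loader are placeholders):

```python
import torch

@torch.no_grad()
def mean_confidence(model, loader, device="cpu"):
    # Average softmax probability assigned to the predicted (argmax) class.
    model.eval()
    confidences = []
    for images, _ in loader:
        probs = torch.softmax(model(images.to(device)), dim=1)
        confidences.append(probs.max(dim=1).values)
    return torch.cat(confidences).mean().item()

# e.g. mean_confidence(resnet9, test_loader) vs mean_confidence(vgg16, test_loader):
# higher accuracy paired with higher loss should show up as a lower mean confidence.
```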