Tags: python, pytorch, classification, torch

Multiclass classification: should the number of output neurons equal the number of classes? (My model has no softmax)


I am using PyTorch for my CNN. I trained the model with CrossEntropyLoss and the Adam optimizer. My dataset has 5 classes. The last layer of my model produces an output of shape [Batch_Size, 400]. I am not using a softmax function for this classification. The model reached 97% accuracy. I cannot post my model architecture here, for certain reasons.

My question is: is it a rule of thumb that the number of output neurons (400 in my case) should always equal the number of classes (5 in my case)? If so, does this have to be followed even when the model has no softmax function? If not, is my model fine as it is?

Note: I searched extensively on these topics, but what I found did not solve my problem, which is why I am posting here.


Solution

  • You're wasting a lot of computation and throwing away most of your output: you compute 400 numbers, but only 5 of them correspond to actual classes.

    PyTorch's CrossEntropyLoss automatically applies a softmax (technically a log-softmax followed by negative log-likelihood) to the outputs of your neural network to compute the loss.
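
    As a minimal sketch (the tensors here are made up for illustration), you can check that CrossEntropyLoss on raw logits is exactly log-softmax followed by negative log-likelihood:

    import torch
    import torch.nn.functional as F

    # Fake batch: 8 samples, 5 classes, raw (un-normalized) logits
    logits = torch.randn(8, 5)
    targets = torch.randint(0, 5, (8,))

    # cross_entropy takes raw logits directly...
    loss_a = F.cross_entropy(logits, targets)
    # ...because it applies log_softmax + nll_loss internally
    loss_b = F.nll_loss(F.log_softmax(logits, dim=1), targets)
    assert torch.allclose(loss_a, loss_b)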

    At inference time, I'd also recommend you apply a softmax to get more readable outputs (you can interpret the softmax output as a probability). To do so, you'd use: torch.nn.functional.softmax(y_predict[:, :5], dim=1).
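
    For example, a short sketch of inference (y_predict here is a stand-in for your model's raw [Batch_Size, 400] output):

    import torch
    import torch.nn.functional as F

    # Placeholder for the raw output of your model
    y_predict = torch.randn(8, 400)

    # Keep only the 5 class logits and turn them into probabilities
    probs = F.softmax(y_predict[:, :5], dim=1)
    predicted_class = probs.argmax(dim=1)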

    You should consider retraining your model with the correct number of outputs: it should not hurt accuracy, and it will shrink the model and speed it up. You could also directly extract the weights you're already using from the pretrained layer, like this:

    import torch

    # Illustrative shapes: this stands in for the trained final layer
    trained_layer = torch.nn.Linear(50, 400)

    # New final layer with one output per class
    new_layer = torch.nn.Linear(50, 5)

    # Copy over the 5 rows of weights (and biases) used for the classes
    with torch.no_grad():
        new_layer.weight.copy_(trained_layer.weight[:5, :])
        new_layer.bias.copy_(trained_layer.bias[:5])
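
    Note that keeping the first 5 rows assumes your labels were 0–4, so those are the output units the loss actually trained; new_layer would then replace the final layer of your network before any fine-tuning.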