Tags: python, neural-network, pytorch, softmax, cross-entropy

Suppress use of Softmax in CrossEntropyLoss for PyTorch Neural Net


I know there's no need to use an nn.Softmax() function in the output layer of a neural net when using nn.CrossEntropyLoss as a loss function.

However, I need to do so. Is there a way to suppress the softmax built into nn.CrossEntropyLoss and instead use nn.Softmax() on the output layer of the neural network itself?

Motivation: I am using the shap package to analyze the feature influences afterwards, and I can only feed my trained model in as input. The outputs then don't make any sense, because I am looking at unbounded values instead of probabilities.

Example: Instead of -69.36 as the output value for one class of my model, I want something between 0 and 1, summing to 1 across all classes. As I can't alter the outputs afterwards, they need to already be in this form during training.



Solution

  • The documentation of nn.CrossEntropyLoss says,

    This criterion combines nn.LogSoftmax() and nn.NLLLoss() in one single class.

    I suggest you stick with CrossEntropyLoss as the loss criterion. However, you can convert the output of your model into probability values by applying the softmax function.

    Please note that you can always post-process the output values of your model; you do not need to change the loss criterion for that.
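
    For example, here is a minimal sketch of that approach (the toy network, layer sizes, optimizer, and the ProbModel wrapper name are illustrative, not taken from the question):

    import torch
    import torch.nn as nn

    # Hypothetical model that outputs raw logits (no Softmax in the last layer).
    model = nn.Sequential(
        nn.Linear(10, 32),
        nn.ReLU(),
        nn.Linear(32, 3),                      # 3 classes, raw unbounded scores
    )

    criterion = nn.CrossEntropyLoss()          # expects raw logits
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    # Training step on a dummy batch.
    x = torch.randn(16, 10)
    y = torch.randint(0, 3, (16,))

    optimizer.zero_grad()
    loss = criterion(model(x), y)              # log-softmax is applied internally
    loss.backward()
    optimizer.step()

    # For shap (or any later analysis), wrap the trained model so it returns probabilities.
    class ProbModel(nn.Module):
        def __init__(self, base):
            super().__init__()
            self.base = base

        def forward(self, x):
            return torch.softmax(self.base(x), dim=1)   # values in [0, 1], rows sum to 1

    prob_model = ProbModel(model)
    print(prob_model(x).sum(dim=1))            # each row sums to 1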

    But if you still want to use Softmax() in your network, then you can use NLLLoss() as the loss criterion; just apply log() to the model's output before feeding it to the criterion function. Similarly, if you use LogSoftmax in your network instead, you can apply exp() to get the probability values. A sketch of this setup follows the update below.

    Update:

    To use log() on the Softmax output, please do:

    torch.log(prob_scores + 1e-20)
    

    By adding a very small number (1e-20) to the prob_scores, we can avoid the log(0) issue.
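
    Here is a minimal sketch of that alternative, tying the pieces together (again, the toy network, layer sizes, and optimizer are illustrative):

    import torch
    import torch.nn as nn

    # Hypothetical network that ends in Softmax, so it directly outputs probabilities.
    model = nn.Sequential(
        nn.Linear(10, 32),
        nn.ReLU(),
        nn.Linear(32, 3),
        nn.Softmax(dim=1),
    )

    criterion = nn.NLLLoss()                    # expects log-probabilities
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    x = torch.randn(16, 10)
    y = torch.randint(0, 3, (16,))

    prob_scores = model(x)                      # already in [0, 1], rows sum to 1
    log_probs = torch.log(prob_scores + 1e-20)  # epsilon avoids log(0)
    loss = criterion(log_probs, y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # If the network ended in nn.LogSoftmax(dim=1) instead, its output could be fed
    # to NLLLoss() directly, and probabilities recovered with .exp() when needed.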