I know there's no need to use an nn.Softmax() function in the output layer of a neural net when using nn.CrossEntropyLoss as the loss function.
However, I need to do so anyway. Is there a way to suppress the softmax that is built into nn.CrossEntropyLoss and instead use nn.Softmax() on the output layer of the neural network itself?
Motivation: I am using the shap package to analyze the feature influences afterwards, and it only accepts my trained model as input. The outputs then don't make any sense, because I am looking at unbounded values instead of probabilities.
Example: Instead of -69.36 as an output value for one class of my model, I want something between 0 and 1, summing to 1 across all classes. Since I can't alter the outputs afterwards, they need to be in this form already during training.
The documentation of nn.CrossEntropyLoss says,
This criterion combines nn.LogSoftmax() and nn.NLLLoss() in one single class.
I suggest you stick with CrossEntropyLoss as the loss criterion. However, you can convert the output of your model into probability values by applying the softmax function.

Note that you can always post-process the output values of your model; you do not need to change the loss criterion for that.
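For example, here is a minimal sketch of that approach (the layer sizes, dummy data, and the prob_model wrapper are illustrative, not taken from your setup): train on raw logits with CrossEntropyLoss, and only convert to probabilities when you need them.

```python
import torch
import torch.nn as nn

# Illustrative classifier; any model whose forward() returns raw logits works here.
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Linear(64, 3),   # 3-class output, raw logits
)

criterion = nn.CrossEntropyLoss()  # expects raw logits
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One training step on dummy data (shapes are just placeholders).
x = torch.randn(8, 20)
y = torch.randint(0, 3, (8,))
loss = criterion(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Outside of training, convert logits to probabilities:
with torch.no_grad():
    probs = torch.softmax(model(x), dim=1)  # each row sums to 1

# If a downstream tool (e.g. shap) needs a model that outputs probabilities,
# wrapping the trained network keeps the training setup unchanged:
prob_model = nn.Sequential(model, nn.Softmax(dim=1))
```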
But if you still want to use Softmax() in your network, then you can use NLLLoss() as the loss criterion; just apply log() to the model's output before feeding it to the criterion. Similarly, if you use LogSoftmax in your network instead, you can apply exp() to get the probability values.
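A sketch of both variants (the architecture and dummy data are again just placeholders):

```python
import torch
import torch.nn as nn

criterion = nn.NLLLoss()  # expects log-probabilities

x = torch.randn(8, 20)
y = torch.randint(0, 3, (8,))

# Variant A: Softmax as the last layer, log() applied before the loss.
softmax_model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Linear(64, 3),
    nn.Softmax(dim=1),        # outputs probabilities directly
)
probs = softmax_model(x)
loss_a = criterion(torch.log(probs), y)

# Variant B: LogSoftmax as the last layer, exp() to recover probabilities.
logsoftmax_model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Linear(64, 3),
    nn.LogSoftmax(dim=1),     # outputs log-probabilities
)
log_probs = logsoftmax_model(x)
loss_b = criterion(log_probs, y)      # NLLLoss takes log-probabilities as-is
probs_b = torch.exp(log_probs)        # values between 0 and 1, summing to 1
```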
Update:

To take log() of the Softmax output, do:
torch.log(prob_scores + 1e-20)
Adding a very small number (1e-20) to prob_scores avoids the log(0) issue.
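A small illustration of why the epsilon matters (the prob_scores values and targets below are made up):

```python
import torch
import torch.nn as nn

criterion = nn.NLLLoss()

# prob_scores stands for the Softmax output of the network.
prob_scores = torch.tensor([[1.0, 0.0, 0.0],    # contains an exact zero
                            [0.2, 0.5, 0.3]])
targets = torch.tensor([0, 1])

# torch.log(prob_scores) alone would give -inf for the zero entries;
# the tiny constant keeps the log and the loss finite.
loss = criterion(torch.log(prob_scores + 1e-20), targets)
```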