
Difference between CrossEntropyLoss and NLLLoss with log_softmax in PyTorch?


When building a classifier in PyTorch, I have two options (both sketched below):

  1. Using the nn.CrossEntropyLoss without any modification in the model
  2. Using nn.NLLLoss with F.log_softmax added as the last layer in the model

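For concreteness, here is a minimal sketch of the two setups (the layer sizes and the names NetA/NetB are made up for illustration):

    import torch.nn as nn
    import torch.nn.functional as F

    # Option 1: the model outputs raw logits, paired with nn.CrossEntropyLoss
    class NetA(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(10, 3)   # placeholder sizes: 10 features, 3 classes

        def forward(self, x):
            return self.fc(x)            # raw logits

    # Option 2: the model outputs log-probabilities, paired with nn.NLLLoss
    class NetB(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(10, 3)

        def forward(self, x):
            return F.log_softmax(self.fc(x), dim=1)   # log-probabilities

    criterion_a = nn.CrossEntropyLoss()   # used with NetA
    criterion_b = nn.NLLLoss()            # used with NetB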
Which of these two approaches should one use, and why?


Solution

  • They're the same.

    If you check the implementation of nn.CrossEntropyLoss, you will find that it calls nll_loss after applying log_softmax to the input.

    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
    

    Edit: it seems the links are now broken; here is the C++ implementation, which shows the same information.
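
    A quick numerical check with random inputs (a minimal sketch; the tensor shapes are arbitrary) confirms that the two formulations produce the same value:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)
    logits = torch.randn(4, 3)            # batch of 4 samples, 3 classes
    targets = torch.tensor([0, 2, 1, 2])  # class indices

    # CrossEntropyLoss applied to raw logits
    loss_ce = nn.CrossEntropyLoss()(logits, targets)
    # NLLLoss applied to log_softmax of the same logits
    loss_nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)

    print(loss_ce.item(), loss_nll.item())
    print(torch.allclose(loss_ce, loss_nll))  # True: the two losses match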