python, deep-learning, pytorch, torch, torchvision

What is running loss in PyTorch and how is it calculated?


I had a look at this tutorial in the PyTorch docs for understanding Transfer Learning. There was one line that I failed to understand.

After the loss is calculated using loss = criterion(outputs, labels), the running loss is calculated using running_loss += loss.item() * inputs.size(0) and finally, the epoch loss is calculated using running_loss / dataset_sizes[phase].

Isn't loss.item() supposed to be the loss for an entire mini-batch? (Please correct me if I am wrong.) That is, if the batch_size is 4, loss.item() would give the loss for the whole set of 4 images. If this is true, why is loss.item() multiplied by inputs.size(0) when calculating running_loss? Isn't that an extra multiplication?

Any help would be appreciated. Thanks!


Solution

  • It's because the loss returned by CrossEntropyLoss (and most other PyTorch loss functions) is averaged over the elements of the batch, i.e. the reduction parameter is 'mean' by default.

    torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')
    

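    Hence, loss.item() is the loss of the entire mini-batch divided by the batch size, i.e. the average loss per sample, not the batch total. Multiplying it by the batch size, given by inputs.size(0), recovers the summed loss for that batch, so running_loss accumulates a sum over all samples. Dividing by dataset_sizes[phase] at the end then gives the average loss per sample for the epoch, which stays correct even when the last batch is smaller than the others.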
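
The arithmetic can be checked with a small sketch in plain Python. It does not use PyTorch: the per-sample losses are made-up numbers, batch_mean stands in for loss.item() under reduction='mean', and len(batch) stands in for inputs.size(0). The dataset size of 10 with a batch size of 4 is chosen only so that the last batch is smaller.

```python
# Hypothetical per-sample losses for a dataset of 10 images (made-up numbers).
per_sample_losses = [0.9, 0.4, 1.2, 0.7, 0.3, 0.8, 1.1, 0.5, 2.0, 1.6]
batch_size = 4  # batches of 4, 4 and 2 samples

running_loss = 0.0
batch_means = []
for start in range(0, len(per_sample_losses), batch_size):
    batch = per_sample_losses[start:start + batch_size]
    batch_mean = sum(batch) / len(batch)     # stands in for loss.item()
    batch_means.append(batch_mean)
    running_loss += batch_mean * len(batch)  # loss.item() * inputs.size(0)

# Equivalent of running_loss / dataset_sizes[phase]
epoch_loss = running_loss / len(per_sample_losses)

# Reference value: the mean over every sample in the dataset
true_mean = sum(per_sample_losses) / len(per_sample_losses)

# What you would get by averaging the batch means without the multiplication
naive_mean = sum(batch_means) / len(batch_means)

print(f"epoch_loss={epoch_loss:.4f} true_mean={true_mean:.4f} naive_mean={naive_mean:.4f}")
```

Here epoch_loss matches true_mean, while naive_mean does not, because the smaller last batch gets the same weight as the full ones. If every batch had exactly the same size, the two approaches would agree.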