I had a look at this tutorial in the PyTorch docs for understanding Transfer Learning. There was one line that I failed to understand.
After the loss is calculated with `loss = criterion(outputs, labels)`, the running loss is accumulated with `running_loss += loss.item() * inputs.size(0)`, and finally the epoch loss is computed as `running_loss / dataset_sizes[phase]`.
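For context, here is the relevant part of the loop as a runnable stand-in, with a dummy model and dataloader in place of the tutorial's (the shapes and data here are hypothetical; only the loss bookkeeping matters):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins for the tutorial's model and data:
model = nn.Linear(8, 2)
criterion = nn.CrossEntropyLoss()
dataset = TensorDataset(torch.randn(10, 8), torch.randint(0, 2, (10,)))
dataloaders = {'train': DataLoader(dataset, batch_size=4)}
dataset_sizes = {'train': len(dataset)}

for phase in ['train']:
    running_loss = 0.0
    for inputs, labels in dataloaders[phase]:
        outputs = model(inputs)
        loss = criterion(outputs, labels)             # loss for this mini-batch
        running_loss += loss.item() * inputs.size(0)  # the line in question
    epoch_loss = running_loss / dataset_sizes[phase]
```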
Isn't `loss.item()` supposed to be the loss for an entire mini-batch (please correct me if I am wrong)? I.e., if the `batch_size` is 4, `loss.item()` would give the loss for the entire set of 4 images. If this is true, why is `loss.item()` being multiplied by `inputs.size(0)` when calculating `running_loss`? Isn't this step an extra multiplication in this case?
Any help would be appreciated. Thanks!
It's because the loss returned by `CrossEntropyLoss` (and most other loss functions) is averaged over the number of elements in the batch, i.e. the `reduction` parameter is `'mean'` by default:

```python
torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')
```
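You can verify the effect of `reduction` directly (a minimal check with arbitrary shapes and random data, purely for illustration):

```python
import torch
import torch.nn as nn

logits = torch.randn(4, 10)            # batch of 4 samples, 10 classes
labels = torch.randint(0, 10, (4,))

mean_loss = nn.CrossEntropyLoss(reduction='mean')(logits, labels)
sum_loss = nn.CrossEntropyLoss(reduction='sum')(logits, labels)

# The default 'mean' loss is the summed per-sample loss divided by the batch size:
print(torch.isclose(mean_loss * 4, sum_loss))  # tensor(True)
```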
Hence, `loss.item()` contains the loss of the entire mini-batch, but divided by the batch size. That's why `loss.item()` is multiplied by the batch size, given by `inputs.size(0)`, when calculating `running_loss`.
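Multiplying by `inputs.size(0)` turns each per-batch mean back into a per-batch sum, so `running_loss / dataset_sizes[phase]` is the exact per-sample average over the whole epoch. This also handles a last batch that is smaller than `batch_size`, where simply averaging the per-batch means would weight it too heavily. A minimal sketch (a hypothetical dataset of 10 samples with batch size 4, so the last batch has only 2):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()   # reduction='mean' by default
logits = torch.randn(10, 5)         # 10 samples, 5 classes
labels = torch.randint(0, 5, (10,))

running_loss = 0.0
for i in range(0, 10, 4):           # batches of size 4, 4, and 2
    batch_logits, batch_labels = logits[i:i+4], labels[i:i+4]
    loss = criterion(batch_logits, batch_labels)       # mean over this batch
    running_loss += loss.item() * batch_logits.size(0) # back to a batch sum

epoch_loss = running_loss / 10
exact = criterion(logits, labels).item()  # mean over the full dataset
print(abs(epoch_loss - exact) < 1e-5)     # True
```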