deep-learning, neural-network, pytorch, loss-function

Understanding Loss Functions


I was following this PyTorch tutorial on how to set up a neural network, but I don't understand the loss function, loss = (y_pred - y).pow(2).sum().item(). Why is this used rather than the derivative of the function used to calculate the predicted y value? I also don't understand what that function returns.


Solution

  • That expression is the squared Euclidean (L2) norm of the error: it returns the sum of the squared differences between the network output and the expected output (see the first sketch below).

    As for the derivative of the function, or rather its gradient, it is computed internally by the deep learning framework you are using (here PyTorch, I assume) and is needed to update the network parameters. For most use cases you do not need to think about it; its computation is entirely automatic. The first sketch below shows both steps.

    One note: if you call .item() on a tensor, you extract its raw value, i.e. what you get is no longer a tensor but a plain Python number. This means you can no longer compute the gradient from it (i.e. call .backward() on it); the second sketch below illustrates this.
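
A minimal sketch of both points in PyTorch. The shapes and values are made up for illustration; the loss is the sum of squared errors, and autograd derives its gradient with no manual calculus:

    import torch

    # Made-up data: a batch of 64 scalar targets.
    y = torch.randn(64, 1)
    y_pred = torch.randn(64, 1, requires_grad=True)

    # Sum of squared errors: the squared Euclidean (L2) norm of the error vector.
    loss = (y_pred - y).pow(2).sum()
    print(loss)         # still a tensor, with a grad_fn attached

    # Autograd computes d(loss)/d(y_pred) internally; no manual derivative needed.
    loss.backward()
    print(y_pred.grad)  # equals 2 * (y_pred - y)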
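
And a second sketch, again with made-up values, of what .item() does to autograd:

    import torch

    y = torch.tensor([1.0, 2.0])
    y_pred = torch.tensor([1.5, 1.0], requires_grad=True)

    loss = (y_pred - y).pow(2).sum()  # tensor: part of the autograd graph
    value = loss.item()               # plain Python float: 0.25 + 1.0 = 1.25

    loss.backward()                   # works: backpropagates through the graph
    # value.backward()                # would fail: a float has no .backward()
    print(value)                      # 1.25

This is why training loops typically call .backward() on the loss tensor and use .item() only when logging or accumulating the loss value.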