I have a combined loss function like this:
loss = alpha * loss0 + beta * loss1 + gamma * loss2 + delta * loss3
I would like to make alpha, beta, gamma, and delta learnable parameters. Notice that alpha, beta, gamma, and delta are outside the nn.Module. How can I do that?
Loss scaling factors are hyperparameters. They need to be set outside the learning loop - they cannot be learned.
The reason is that the optimizer can trivially drive the combined loss to zero, or make it arbitrarily negative, by ignoring the actual loss terms and instead setting the scaling factors to 0 or to large negative numbers.
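To make the collapse concrete, here is a minimal pure-Python sketch. It assumes two loss terms whose values are held fixed and positive (a simplification; in real training they change as the model updates) and applies plain gradient descent to the weights only. The names `combined_loss` and `sgd_on_weights` and all the numbers are hypothetical, chosen for illustration.

```python
def combined_loss(weights, losses):
    # Weighted sum, same shape as alpha * loss0 + beta * loss1 + ...
    return sum(w * l for w, l in zip(weights, losses))


def sgd_on_weights(weights, losses, lr=0.1, steps=100):
    # d(total)/d(w_i) = loss_i, so every step subtracts lr * loss_i from w_i.
    # With non-negative loss terms the weights can only move downward.
    weights = list(weights)
    for _ in range(steps):
        weights = [w - lr * l for w, l in zip(weights, losses)]
    return weights


losses = [0.8, 1.5]   # hypothetical, fixed loss values
start = [1.0, 1.0]    # initial alpha and beta
learned = sgd_on_weights(start, losses)

print("initial total:", combined_loss(start, losses))
print("learned weights:", learned)
print("final total:", combined_loss(learned, losses))
```

The weights end up negative and the total goes below zero even though neither loss term improved, which is the failure described above.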
The coefficients alpha, beta, etc. cannot be part of the loss optimization itself. Look into hyperparameter tuning for methods of selecting and evaluating different hyperparameters outside the loss optimization loop.
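As a rough sketch of what tuning outside the loop can look like, the example below runs a small grid search: each (alpha, beta) pair is held fixed during training, and the pairs are compared on a validation metric that does not depend on the weights. Everything here is an assumption made for illustration: the one-parameter model, the two loss terms (mean squared error plus an L2 penalty), the data, and the grid values.

```python
import itertools


def train(alpha, beta, data, lr=0.05, steps=500):
    # Inner loop: fit a single weight w while alpha and beta stay fixed.
    # Objective: alpha * MSE(w) + beta * w**2
    w = 0.0
    n = len(data)
    for _ in range(steps):
        grad_mse = sum(2 * (w * x - y) * x for x, y in data) / n
        grad_l2 = 2 * w
        w -= lr * (alpha * grad_mse + beta * grad_l2)
    return w


def validation_error(w, data):
    # Plain MSE on held-out data; it does not involve alpha or beta.
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


train_data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # hypothetical (x, y) pairs
val_data = [(1.5, 3.0), (2.5, 5.0)]                # hypothetical held-out pairs

alphas = [0.5, 1.0]
betas = [0.0, 0.1, 1.0]

# Outer loop: try each combination and score it on validation data.
results = {
    (a, b): validation_error(train(a, b, train_data), val_data)
    for a, b in itertools.product(alphas, betas)
}
best = min(results, key=results.get)
print("best (alpha, beta):", best, "validation MSE:", results[best])
```

Because the selection criterion is the validation error rather than the weighted training loss, shrinking a coefficient toward zero no longer counts as an improvement by itself.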