Tags: python, deep-learning, pytorch

Optimizing a neural network with a multi-task objective in Pytorch


In deep learning, you typically have a single objective (say, image recognition) that you wish to optimize. In my field (natural language processing), though, we've seen a rise of multi-task training, for instance combining next sentence prediction and sentence classification in a single system.

I understand how to build the forward pass, e.g. for a classification task (obj1) and a regression task (obj2):

import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(300, 200)  # shared representation
        self.obj1 = nn.Linear(200, 5)      # classification head (5 classes)
        self.obj2 = nn.Linear(200, 1)      # regression head

    def forward(self, inputs):
        out = self.linear(inputs)
        out_obj1 = self.obj1(out)
        out_obj2 = self.obj2(out)
        return out_obj1, out_obj2

But the question then becomes: how does one optimize this? Do you call a backward pass over each loss separately, or do you reduce them to a single loss (e.g. a sum or average)? Is there an approach that is typically used for multi-task learning?

As a follow-up, one could even argue that the parameters of the separate heads need different optimizers. In that case, I presume the losses would have to be handled separately.


Solution

  • It is much simpler than that: you can optimize all parameters at the same time without a problem. Just compute both losses with their respective criteria and add them into a single variable:

    total_loss = loss_1 + loss_2
    

    Calling .backward() on this total loss (still a Tensor) works perfectly fine for both objectives: gradients from each loss flow back through the shared layer. You could also weight the losses to give more importance to one over the other, as shown in the sketch after this answer.

    Check the PyTorch forums for more information.
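
    Here is a minimal sketch of one training step, assuming the Net module from the question, nn.CrossEntropyLoss for the classification head and nn.MSELoss for the regression head; the loss weights, optimizer, learning rate, and dummy data are purely illustrative:

    import torch
    import torch.nn as nn

    model = Net()
    criterion_cls = nn.CrossEntropyLoss()   # assumed criterion for obj1
    criterion_reg = nn.MSELoss()            # assumed criterion for obj2
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    inputs = torch.randn(32, 300)               # dummy batch of 32 examples
    labels_cls = torch.randint(0, 5, (32,))     # dummy class labels
    targets_reg = torch.randn(32, 1)            # dummy regression targets

    out_obj1, out_obj2 = model(inputs)
    loss_1 = criterion_cls(out_obj1, labels_cls)
    loss_2 = criterion_reg(out_obj2, targets_reg)

    # Optionally weight the two objectives before summing (weights are illustrative).
    total_loss = 1.0 * loss_1 + 0.5 * loss_2

    optimizer.zero_grad()
    total_loss.backward()   # gradients reach both heads and the shared layer
    optimizer.step()

    If you really wanted different optimizers per head, you could pass each head's parameters to its own optimizer (or use parameter groups) and still call .backward() once on the summed loss, then step each optimizer.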