Tags: neural-network, pytorch, parameter-passing, gradient-descent, backpropagation

Having a network's output as another network's parameters


I have y = Y(x;theta) and theta = M(t;omega), where x and t are input variables (given by the dataset), and theta and omega are trainable parameters. I need theta to be a function of omega. I then have a loss function over y and need to backpropagate its gradient through Y and into M. How can I create such a structure in PyTorch?

Currently, my network is built as follows (sizes is a list of integers, defined as sizes = [input_size, hidden1_size, hidden2_size, ..., output_size])

import torch
import torch.nn as nn
import torch.nn.functional as F

class M(nn.Module):
    def __init__(self, sizes):
        super(M, self).__init__()
        # one linear layer per consecutive pair of sizes
        self.layers = nn.ModuleList()
        for i in range(0, len(sizes) - 1):
            self.layers.append(nn.Linear(sizes[i], sizes[i+1]))

    def forward(self, x):
        # ReLU after every layer except the last one
        for l in self.layers[:-1]:
            x = F.relu(l(x))
        x = self.layers[-1](x)

        return x
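
For example, with a hypothetical sizes = [8, 64, 64, 3] this builds three linear layers with ReLU activations between them:

net = M(sizes=[8, 64, 64, 3])  # hypothetical: 8 inputs, two hidden layers of 64, 3 outputs
out = net(torch.randn(8))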

Solution

  • I think this is quite simple, or perhaps I didn't understand your question correctly.

    x, t are your input variables.

    Now let us define a network M that will take input t and output theta.

    M = nn.Sequential(....) # declare network here
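
    As a minimal sketch (with hypothetical dimensions: t_dim for the input t, and theta acting as the weight of a linear layer mapping in_features inputs to out_features outputs), M could be:

    t_dim, in_features, out_features = 5, 10, 3          # hypothetical sizes
    M = nn.Sequential(
        nn.Linear(t_dim, 64),
        nn.ReLU(),
        nn.Linear(64, out_features * in_features),       # emits theta, flattened
    )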
    

    Next, we define a network Y. This part might be tricky, since you want to use theta as the parameters of Y. It is easier and more intuitive to work with the functional counterparts of the modules declared in nn (see https://pytorch.org/docs/stable/nn.functional.html). I will give an example of this, assuming theta are the parameters of a linear module.

    class Y(nn.Module):
        def __init__(self):
            super(Y, self).__init__()
            # declare any other modules here

        def forward(self, theta, x):
            # use theta, produced by M, as the weight of a functional linear layer
            return nn.functional.linear(input=x, weight=theta, bias=None)
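
    Note that nn.functional.linear expects a weight of shape (out_features, in_features), while M emits a flat vector, so theta has to be reshaped first. A minimal sketch (assuming a single, unbatched t; a batch of t's would give one weight matrix per sample and need something like torch.bmm):

    y_net = Y()                                     # instance of the module above
    theta = M(t)                                    # flat vector of size out_features * in_features
    theta = theta.view(out_features, in_features)   # weight matrix for nn.functional.linear
    y = y_net(theta, x)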
    

    The overall forward pass would be

    def forward(t, x, M, Y):
        theta = M(t)          # theta = M(t; omega)
        output = Y(theta, x)  # y = Y(x; theta)
        return output
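
    Putting it together, a minimal (hypothetical) training step could look like the following; the loss on the output flows back through Y into M, so omega (the parameters of M) receives gradients:

    # hypothetical dimensions
    t_dim, x_dim, out_dim = 5, 10, 3
    M = nn.Sequential(
        nn.Linear(t_dim, 64),
        nn.ReLU(),
        nn.Linear(64, out_dim * x_dim),
    )
    y_net = Y()
    optimizer = torch.optim.Adam(M.parameters(), lr=1e-3)   # only omega is trained

    t = torch.randn(t_dim)
    x = torch.randn(x_dim)
    target = torch.randn(out_dim)

    theta = M(t).view(out_dim, x_dim)        # theta as a function of omega
    output = y_net(theta, x)                 # y = Y(x; theta)
    loss = nn.functional.mse_loss(output, target)

    optimizer.zero_grad()
    loss.backward()                          # gradient flows through Y and into M
    optimizer.step()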