Tags: neural-network, gradient, pytorch, backpropagation

Auto updating custom layer parameters while backpropagating in pytorch


I have a pytorch custom layer defined as:

import torch
import torch.nn as nn

class MyCustomLayer(nn.Module):
  def __init__(self):
    super(MyCustomLayer, self).__init__()

    self.my_parameter = torch.rand(1, requires_grad = True)

    # the following allows the previously defined parameter to be recognized as a network parameter when instantiating the model
    self.my_registered_parameter = nn.ParameterList([nn.Parameter(self.my_parameter)])

  def forward(self, x):
    return x*self.my_parameter

I then define my network where the custom layer is used:

class MyNet(nn.Module):
  def __init__(self):
    super(MyNet, self).__init__()
    self.layer1 = MyCustomLayer()

  def forward(self, x):
    x = self.layer1(x)
    return x

Now let's instantiate MyNet and observe the issue:

# instantiate MyNet and run it over one input value
model = MyNet()
x = torch.rand(1)
output = model(x)
criterion = nn.MSELoss()
loss = criterion(output, torch.ones(1))
loss.backward()

Iterating through the model parameters shows None for the custom layer's parameter:

for p in model.parameters():
    print(p.grad)

None

while directly accessing that parameter shows the correct grad value:

print(model.layer1.my_parameter.grad)

tensor([-1.4370])

This, in turn, prevents the optimizer step from updating the inner parameter automatically and leaves me with the hassle of updating it manually. Does anyone know how I can address this issue?


Solution

  • What you did, i.e. returning x*self.my_registered_parameter[0], worked because in that case you use the registered parameter to compute the output, so the gradient flows to it.

    When you call nn.Parameter it returns a new object, so the self.my_parameter you use in the forward pass and the one you registered are not the same object (see the first sketch at the end of this answer).

    You can fix this by declaring my_parameter as an nn.Parameter directly:

    self.my_parameter = nn.Parameter(torch.rand(1))  # nn.Parameter requires grad by default
    self.my_registered_parameter = nn.ParameterList([self.my_parameter])
    

    Alternatively, you don't need the my_registered_parameter variable at all: once self.my_parameter is declared as an nn.Parameter, assigning it as a module attribute registers it automatically (see the second sketch below).
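
To illustrate why the original version fails, here is a minimal standalone sketch (not part of the original post): nn.Parameter(t) returns a new tensor object, so the registered copy never takes part in the forward computation and therefore never receives a gradient.

import torch
import torch.nn as nn

# nn.Parameter(t) returns a *new* leaf tensor; it is a distinct object
# and a separate node in the autograd graph from t.
t = torch.rand(1, requires_grad=True)
p = nn.Parameter(t)
print(p is t)       # False -- two different objects

# Only the tensor actually used in the computation receives a gradient.
loss = (t * 2).sum()
loss.backward()
print(t.grad)       # tensor([2.])
print(p.grad)       # None -- p was never part of the graph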
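
And here is a sketch of the simpler variant, reusing the MyNet wrapper from the question (the target torch.ones(1) is just an illustrative choice): the parameter now appears in model.parameters() with its gradient, so an optimizer step updates it automatically.

import torch
import torch.nn as nn

class MyCustomLayer(nn.Module):
    def __init__(self):
        super(MyCustomLayer, self).__init__()
        # Assigning an nn.Parameter as an attribute registers it automatically;
        # no ParameterList is needed.
        self.my_parameter = nn.Parameter(torch.rand(1))

    def forward(self, x):
        return x * self.my_parameter

class MyNet(nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()
        self.layer1 = MyCustomLayer()

    def forward(self, x):
        return self.layer1(x)

model = MyNet()
output = model(torch.rand(1))
loss = nn.MSELoss()(output, torch.ones(1))
loss.backward()

# The custom parameter now shows up with its gradient ...
for p in model.parameters():
    print(p.grad)    # a tensor, no longer None

# ... and an optimizer step updates it automatically.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
optimizer.step()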