Tags: pytorch, neural-network, gradient

Gradient of a neural network using PyTorch


Let's say I define a neural network M : R^2 x [Net_params] --> R^2, with y = M(x, theta). I need a way to get the gradients evaluated at a specific input: dM/dx|x=x_0 and dM/d_theta|x=x_0.

I want to use PyTorch for the implementation:

import torch
from torch import nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(2, 20, dtype=float),
            nn.ReLU(),
            nn.Linear(20, 20, dtype=float),
            nn.ReLU(),
            nn.Linear(20, 2, dtype=float)
        )

    def forward(self, x):
        return self.linear_relu_stack(x)

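# Assumption: `device` was not defined in the original snippet; use the GPU if one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
M = NeuralNetwork().to(device)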

Is there something like this?

input = torch.tensor([1,0])
y = M(input)
y.backward()
grad_input = input.grad
grad_params = y.params.grad

I'm aware the above code is garbage, but I'm looking for something like this.

I tried to take the gradient of the network using backward, but this does not seem to work: y needs to be a scalar, which seems weird.


Solution

  • By dM/dx and dM/d_theta, you probably mean dy/dx and dy/d_theta.

    You need to backpropagate on scalar outputs, for example:

    >>> y.mean().backward()
    

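    Then the gradient of that scalar with respect to x is stored in input.grad, provided input was created as a floating-point tensor with requires_grad=True and the same dtype as the model (float64 here). The gradients with respect to theta are stored in the .grad attribute of each tensor in M.parameters(). Note that these are gradients of y.mean(), not the full Jacobian of y: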

    >>> input.grad
    

    You can read more about backpropagation in PyTorch on this thread.
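
If the full Jacobians dy/dx and dy/d_theta at x_0 are what is needed (rather than the gradient of a single scalar such as y.mean()), one possible approach is sketched below. It is only a sketch under some assumptions: the network is rebuilt as a plain nn.Sequential with the same layers as in the question, it runs on the CPU, and the names x0, J_x and J_theta are made up for illustration. It uses torch.autograd.functional.jacobian for the input and one torch.autograd.grad call per output component for the parameters.

```python
import torch
from torch import nn
from torch.autograd.functional import jacobian

torch.manual_seed(0)

# Same layers as the network in the question, in double precision (CPU only here).
M = nn.Sequential(
    nn.Linear(2, 20, dtype=torch.float64),
    nn.ReLU(),
    nn.Linear(20, 20, dtype=torch.float64),
    nn.ReLU(),
    nn.Linear(20, 2, dtype=torch.float64),
)

# The input must be floating point (matching the model dtype) and track gradients.
x0 = torch.tensor([1.0, 0.0], dtype=torch.float64, requires_grad=True)

# dy/dx at x0: a 2x2 matrix with J_x[i, j] = d y_i / d x_j.
J_x = jacobian(M, x0)

# dy/d_theta at x0: one gradient computation per output component,
# flattened and stacked into a (2, number_of_parameters) matrix.
y = M(x0)
params = list(M.parameters())
rows = []
for i in range(y.numel()):
    grads = torch.autograd.grad(y[i], params, retain_graph=True)
    rows.append(torch.cat([g.reshape(-1) for g in grads]))
J_theta = torch.stack(rows)

# For comparison, the scalar approach from the answer: backpropagating
# y.sum() gives the column sums of the Jacobian in x0.grad.
y.sum().backward()
print(J_x.shape, J_theta.shape)
print(torch.allclose(x0.grad, J_x.sum(dim=0)))
```

The loop over output components is fine for two outputs; for networks with many outputs, a vectorized Jacobian routine would likely be a better fit.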