python · pytorch · autograd · automatic-differentiation

Torch automatic differentiation for matrix defined with components of vector


The title is quite self-explanatory. I have the following:

import torch

x = torch.tensor([3., 4.], requires_grad=True)
A = torch.tensor([[x[0], x[1]],
                  [x[1], x[0]]], requires_grad=True)

f = torch.norm(A)
f.backward()

I would like to compute the gradient of f with respect to x, but if I type x.grad I just get None. If I use the more explicit command torch.autograd.grad(f, x) instead of f.backward(), I get

RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
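
Following the error's suggestion, a quick sketch with allow_unused=True (shown below) just returns None for x, again suggesting that x never made it into the graph:

import torch

x = torch.tensor([3., 4.], requires_grad=True)
A = torch.tensor([[x[0], x[1]],
                  [x[1], x[0]]], requires_grad=True)
f = torch.norm(A)

# allow_unused=True suppresses the RuntimeError and just returns (None,)
# for x, i.e. x is not connected to f at all
print(torch.autograd.grad(f, x, allow_unused=True))  # (None,)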


Solution

  • The problem might be that when you take a slice of a leaf tensor, you get back a non-leaf tensor, like so:

    >>> x.is_leaf
    True
    >>> x[0].is_leaf
    False
    

    So it is not x itself that ends up in the graph, but the slices x[0] and x[1]; on top of that, torch.tensor(...) copies their values into a brand-new leaf tensor rather than recording the indexing, so the graph of f never reaches back to x (a quick check of this is sketched at the end of this answer).

    Try this instead:

    >>> import torch
    >>> 
    >>> x = torch.tensor([3., 4.], requires_grad=True)
    >>> xt = torch.empty_like(x).copy_(x.flip(0))
    >>> A = torch.stack([x,xt])
    >>> 
    >>> f = torch.norm(A)
    >>> f.backward()
    >>> 
    >>> x.grad
    tensor([0.8485, 1.1314])
    

    The difference is that PyTorch now tracks x as part of the graph, so f.backward() populates its gradient. Here you'll find a few different ways of copying tensors and the effect each has on the graph.
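
    To make the failure mode and the fix concrete, here is a minimal sketch (an addition beyond the snippet above, not from the linked page) that first confirms torch.tensor(...) copies the sliced values into a fresh leaf with no history, and then builds A directly from the slices with torch.stack, which is another way to keep the graph connected to x:

    import torch

    x = torch.tensor([3., 4.], requires_grad=True)

    # torch.tensor(...) always copies data: the result is a brand-new leaf
    # with no grad_fn, so the indexing of x is not recorded anywhere
    A_copy = torch.tensor([[x[0], x[1]],
                           [x[1], x[0]]], requires_grad=True)
    print(A_copy.is_leaf, A_copy.grad_fn)  # True None

    # stacking the slices themselves keeps the operations in the graph
    A = torch.stack([torch.stack([x[0], x[1]]),
                     torch.stack([x[1], x[0]])])
    print(A.is_leaf, A.grad_fn)  # False <StackBackward0 ...>

    f = torch.norm(A)
    f.backward()

    # sanity check: f = sqrt(2*x0**2 + 2*x1**2) = sqrt(50), so
    # df/dx0 = 2*x0/f = 6/sqrt(50) ≈ 0.8485 and df/dx1 = 8/sqrt(50) ≈ 1.1314
    print(x.grad)  # tensor([0.8485, 1.1314])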