
PyTorch derivative returns None on .grad


import math
import torch as tr

i1 = tr.tensor(0.0, requires_grad=True)
i2 = tr.tensor(0.0, requires_grad=True)
x = tr.tensor(2*(math.cos(i1)*math.cos(i2) - math.sin(i1)*math.sin(i2)) + 3*math.cos(i1), requires_grad=True)
y = tr.tensor(2*(math.sin(i1)*math.cos(i2) + math.cos(i1)*math.sin(i2)) + 3*math.sin(i1), requires_grad=True)

z = (x - (-2))**2 + (y - 3)**2
z.backward()
dz_t1 = i1.grad
dz_t2 = i2.grad
print(dz_t1)
print(dz_t2)

I'm trying to run the code above, but I'm facing an issue after z.backward(): i1.grad and i2.grad return None. From what I understand, the cause lies in how backward() is evaluated in torch, so something along the lines of i1.retain_grad() has to be used to avoid this. I tried that but still get None; i1.retain_grad() and i2.retain_grad() were placed both before and after z.backward(), and the result is still None. What's happening exactly, and how do I fix it? y.grad and x.grad work fine.


Solution

  • Use:

    import torch

    i1 = torch.tensor(0.0, requires_grad=True)
    i2 = torch.tensor(0.0, requires_grad=True)
    x = 2*(torch.cos(i1)*torch.cos(i2) - torch.sin(i1)*torch.sin(i2)) + 3*torch.cos(i1)
    y = 2*(torch.sin(i1)*torch.cos(i2) + torch.cos(i1)*torch.sin(i2)) + 3*torch.sin(i1)
    z = (x - (-2))**2 + (y - 3)**2
    z.backward()
    dz_t1 = i1.grad
    dz_t2 = i2.grad
    print(dz_t1)  # tensor(-30.)
    print(dz_t2)  # tensor(-12.)
    

    Here, using torch.sin and torch.cos (instead of math.sin and math.cos) keeps x and y as torch tensors that are connected to i1 and i2 in the computational graph. In your version, math.cos(i1) converts the tensor to a plain Python float, and wrapping the result in tr.tensor(...) creates a brand-new leaf tensor with no history. That detaches x and y from i1 and i2, so gradients cannot flow back to them. It also explains why retain_grad() made no difference: retain_grad() only keeps .grad on non-leaf tensors that already sit inside a graph; it cannot reconnect a tensor that was never attached.
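
    As a quick sanity check (a minimal sketch along the same lines, not part of the original answer), you can inspect .grad_fn to see whether a tensor is actually attached to the graph; it is None for detached leaves:

        import math
        import torch

        i1 = torch.tensor(0.0, requires_grad=True)

        # Built from torch ops: attached to the graph, so it carries a grad_fn
        attached = 3 * torch.cos(i1)
        print(attached.grad_fn)   # e.g. <MulBackward0 object at ...>

        # Built by wrapping a Python float: a fresh leaf with no history
        detached = torch.tensor(3 * math.cos(i1), requires_grad=True)
        print(detached.grad_fn)   # None -- backward() cannot reach i1 through this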