pytorch, tensor, torch, derivative

Matrix partial derivative with Torch


I have a function f(t,k) = tk + 2tk^2. I want to get its partial derivative with respect to k for t=[1,2] and k=[2,3]. I have created a matrix of tensors as follows: gamma = [[10, 21], [20, 42]]

When I try to get the derivative of gamma with respect to k, I expect a 2x2 matrix, but torch returns a vector instead. What is the problem?

import torch
import numpy as np

t=torch.tensor([1.0], requires_grad=True)
k=torch.tensor([1.0], requires_grad=True)
gamma=torch.zeros((2,2))

def f(a,b): #c is a coeff vector
    global t,k,gamma
    t=torch.tensor(a, dtype=float,requires_grad=True)
    k=torch.tensor(b, dtype=float,requires_grad=True)
    for i in range(2):
        for j in range(2):
            gamma[i,j]=torch.mul(t[i],k[j])+torch.mul(2,torch.mul(t[i],torch.pow(k[j],2)))

a=[1,2]
b=[2,3]
f(a,b)

def dy_dx(y, x):
    return torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),allow_unused=True, create_graph=True)
    
print(dy_dx(gamma,k))

I tried to get the derivative of f(t,k) with respect to k for t=[1,2] and k=[2,3]. I expected a matrix as follows: [[9, 13], [18, 26]]

but torch returns a vector as follows: [27., 39.]


Solution

  • According to the PyTorch documentation, the grad function:

    Computes and returns the sum of gradients of outputs with respect to the inputs.
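
To make the "sum of gradients" concrete, here is a minimal sketch using the values from the question. It builds gamma by broadcasting instead of the nested loops (an assumption made for brevity, not the asker's code) and compares the result of grad with the full Jacobian. The names `summed` and `jac` are illustrative only.

```python
import torch

def f(t, k):
    return t * k + 2 * t * k ** 2

t = torch.tensor([1., 2.])
k = torch.tensor([2., 3.], requires_grad=True)

# Broadcasting builds gamma[i, j] = f(t[i], k[j]) without Python loops.
gamma = f(t[:, None], k[None, :])

# With grad_outputs=ones, grad computes a vector-Jacobian product:
# summed[m] = sum over i, j of d gamma[i, j] / d k[m]
(summed,) = torch.autograd.grad(gamma, k, grad_outputs=torch.ones_like(gamma))
print(summed)  # tensor([27., 39.]), the column sums of the expected matrix

# The full Jacobian keeps every entry: jac[i, j, m] = d gamma[i, j] / d k[m]
jac = torch.autograd.functional.jacobian(lambda kk: f(t[:, None], kk[None, :]), k)
print(jac.shape)  # torch.Size([2, 2, 2])
```

Summing `jac` over its first two dimensions reproduces `summed`, which is exactly the reduction that grad performs.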

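    That sum is why you get a vector: with grad_outputs set to ones, the contributions of all entries of gamma to each k[j] are added together, so each component is a column sum of the matrix you expected (9 + 18 = 27 and 13 + 26 = 39). To get the derivative of each output with respect to each input, use the jacobian function instead. The following snippet computes the first derivatives of the function f(t, k) w.r.t. t and k: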

    import torch
    
    def f(t, k):
        return t * k + 2 * t * k ** 2
    
    t = torch.tensor([1., 2., 4.], requires_grad=True)
    k = torch.tensor([2., 3.], requires_grad=True)
    T, K = torch.meshgrid(t, k, indexing='ij')
    
    gamma = f(T,K)
    
    jacobian_t, jacobian_k = torch.autograd.functional.jacobian(f, (T, K))
    
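    # f is elementwise, so only the entries d f[i, j] / d T[i, j] are nonzero.
    # Take the diagonal of the flattened Jacobian rather than masking with > 0,
    # which would drop derivatives that happen to be zero or negative.
    n = T.numel()
    dfdt = jacobian_t.reshape(n, n).diagonal().reshape(T.shape)
    dfdk = jacobian_k.reshape(n, n).diagonal().reshape(T.shape)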
    
    print(gamma)
    print(dfdt)
    print(dfdk)
    

    The output:

    tensor([[10., 21.],
            [20., 42.],
            [40., 84.]], grad_fn=<AddBackward0>)
    
    tensor([[10., 21.],
            [10., 21.],
            [10., 21.]])
    
    tensor([[ 9., 13.],
            [18., 26.],
            [36., 52.]])
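
Because f is applied elementwise, each entry of gamma depends only on the matching entries of T and K, so most of the full Jacobian is zeros. Under that assumption, a single torch.autograd.grad call on the grid tensors should give the same matrices without materializing the Jacobian. This is a sketch, not part of the original answer; the `.clone()` calls are there so that T and K are independent leaf tensors.

```python
import torch

def f(t, k):
    return t * k + 2 * t * k ** 2

t = torch.tensor([1., 2., 4.])
k = torch.tensor([2., 3.])

# Build the grid, then make each grid tensor its own leaf that tracks gradients.
T, K = torch.meshgrid(t, k, indexing='ij')
T = T.clone().requires_grad_(True)
K = K.clone().requires_grad_(True)

gamma = f(T, K)

# Each gamma[i, j] depends only on T[i, j] and K[i, j], so the "sum of
# gradients" has exactly one term per entry: the elementwise derivative.
dfdt, dfdk = torch.autograd.grad(gamma, (T, K), grad_outputs=torch.ones_like(gamma))

print(dfdt)
print(dfdk)
```

This shortcut only holds for elementwise functions; if an output entry depended on several grid entries, the sum would mix their contributions again.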
    

    If you need a more efficient Jacobian computation, pass vectorize=True to this function (the documentation marks this option as experimental). The function also supports both reverse-mode (the default) and forward-mode automatic differentiation, selected through its strategy parameter.
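
A minimal sketch of those two options, assuming a reasonably recent PyTorch release in which jacobian accepts the strategy argument (forward mode is documented as requiring vectorize=True). The helper `gamma_of_k` is a hypothetical name introduced here; it closes over t so that only k is differentiated.

```python
import torch

def f(t, k):
    return t * k + 2 * t * k ** 2

t = torch.tensor([1., 2., 4.])
k = torch.tensor([2., 3.])

def gamma_of_k(kk):
    # gamma[i, j] = f(t[i], kk[j]) via broadcasting, shape (3, 2)
    return f(t[:, None], kk[None, :])

# Reverse mode (the default strategy), vectorized over the output entries.
jac_rev = torch.autograd.functional.jacobian(gamma_of_k, k, vectorize=True)

# Forward mode, which tends to pay off when there are more outputs than inputs.
jac_fwd = torch.autograd.functional.jacobian(
    gamma_of_k, k, vectorize=True, strategy="forward-mode"
)

# jac[i, j, m] = d gamma[i, j] / d k[m], which is nonzero only when m == j,
# so the diagonal over the last two dimensions is the 3x2 matrix of df/dk.
dfdk = torch.diagonal(jac_rev, dim1=1, dim2=2)
print(dfdk)
```

Both strategies should return the same values here; which one is faster depends on the relative number of inputs and outputs.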