I want to compute the Jacobian matrix with TensorFlow.
I have:
def compute_grads(fn, vars, data_num):
    grads = []
    for n in range(0, data_num):
        for v in vars:
            grads.append(tf.gradients(tf.slice(fn, [n, 0], [1, 1]), v)[0])
    return tf.reshape(tf.stack(grads), shape=[data_num, -1])
fn is a loss function, vars are all the trainable variables, and data_num is the number of data points.
But as the number of data points grows, compute_grads takes a tremendous amount of time to run. How can I fix this?
Assuming that X and Y are TensorFlow tensors and that Y depends on X:
from tensorflow.python.ops.parallel_for.gradients import jacobian
J = jacobian(Y, X)
The result has shape Y.shape + X.shape and provides the partial derivative of each element of Y with respect to each element of X.
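To see what that shape convention means, here is a small NumPy sketch (not the TensorFlow API) that builds a finite-difference Jacobian with the same Y.shape + X.shape layout; the function numerical_jacobian and the example function f are hypothetical names for illustration:

```python
import numpy as np

# Hypothetical finite-difference Jacobian with the same shape convention
# as jacobian(Y, X): the result has shape y.shape + x.shape, and
# J[y_idx + x_idx] approximates d y[y_idx] / d x[x_idx].
def numerical_jacobian(f, x, eps=1e-6):
    y = f(x)
    J = np.zeros(y.shape + x.shape)
    for idx in np.ndindex(*x.shape):
        x_plus = x.copy()
        x_plus[idx] += eps
        # The slice J[..., idx] has shape y.shape: one column of partials.
        J[(Ellipsis,) + idx] = (f(x_plus) - y) / eps
    return J

# Example: f(x) = [x0 * x1, x0 + x1], so the exact Jacobian at (2, 3)
# is [[3, 2], [1, 1]].
f = lambda x: np.array([x[0] * x[1], x[0] + x[1]])
x = np.array([2.0, 3.0])
J = numerical_jacobian(f, x)
print(J.shape)  # (2, 2), i.e. Y.shape + X.shape
```

The loop over elements here is only for illustration; the point of the parallel_for jacobian is that it vectorizes exactly this kind of per-element gradient loop, which is why it is much faster than calling tf.gradients once per output element as in compute_grads.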