Matrix determinant differentiation in TensorFlow

I am interested in computing the derivative of a matrix determinant using TensorFlow. I can see from experimentation that TensorFlow has not implemented a method of differentiating through a determinant:

LookupError: No gradient defined for operation 'MatrixDeterminant' 
(op type: MatrixDeterminant)

A little further investigation revealed that it is actually possible to compute the derivative; see, for example, Jacobi's formula. As far as I can tell, to differentiate through a determinant this way I need to register a gradient function using the function decorator,

@tf.RegisterGradient("MatrixDeterminant")
def _sub_grad(op, grad):
    ...

However, I am not familiar enough with TensorFlow to understand how this can be accomplished. Does anyone have any insight on this matter?
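
For reference, Jacobi's formula mentioned above can be written as follows, assuming $A$ is invertible. The symbols $c = \det A$ and $L$ (a scalar loss that depends on $c$) are introduced here only for illustration:

```latex
\frac{d}{dt}\det A(t) = \det A(t)\,\mathrm{tr}\!\left(A(t)^{-1}\,\frac{dA(t)}{dt}\right)
\quad\Longrightarrow\quad
\frac{\partial \det A}{\partial A} = \det(A)\,A^{-\mathsf{T}},
\qquad
\frac{\partial L}{\partial A} = \frac{\partial L}{\partial c}\,\det(A)\,A^{-\mathsf{T}}
```

The last expression is the form a registered gradient function would have to return: the incoming gradient, times the determinant, times the transposed inverse.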

Here's an example where I run into this issue:

import tensorflow as tf

x = tf.Variable(tf.ones(shape=[1]))
y = tf.Variable(tf.ones(shape=[1]))

A = tf.reshape(
    tf.pack([tf.sin(x), tf.zeros([1, ]), tf.zeros([1, ]), tf.cos(y)]), (2,2))
loss = tf.square(tf.matrix_determinant(A))

optimizer = tf.train.GradientDescentOptimizer(0.001)
train = optimizer.minimize(loss)

init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

for step in xrange(100):
    sess.run(train)  # fails while no gradient is registered for MatrixDeterminant


  • Please check the "Implement Gradient in Python" section of the TensorFlow documentation on adding a new op.

    In particular, you can implement it as follows:

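    @tf.RegisterGradient("MatrixDeterminant")
    def _MatrixDeterminantGrad(op, grad):
      """Gradient for MatrixDeterminant. Use formula from 2.2.4 from
      An extended collection of matrix derivative results for forward and reverse
      mode algorithmic differentiation by Mike Giles
      """
      A = op.inputs[0]   # input matrix
      C = op.outputs[0]  # det(A), already computed in the forward pass
      Ainv = tf.matrix_inverse(A)
      # Reverse-mode rule: dL/dA = dL/dC * det(A) * inverse(A)^T
      return grad*C*tf.transpose(Ainv)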
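
As a sanity check on the expression in the return line, grad*C*tf.transpose(Ainv), here is a small sketch in plain Python (no TensorFlow needed) that compares the analytic gradient of the determinant with a finite-difference estimate. The helper names (minor, det, jacobi_grad, numeric_grad) are made up for this illustration and are not part of any library. It relies on the fact that det(A) * inverse(A)^T is the cofactor matrix of A.

```python
# Check Jacobi's formula: d det(A) / dA = det(A) * inverse(A)^T (the cofactor matrix).
# All helper names below are hypothetical, chosen for this illustration only.

def minor(m, i, j):
    """Return a copy of m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1.0  # determinant of the empty matrix
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def jacobi_grad(m):
    """Analytic gradient of det(m): entry (i, j) is the cofactor of m[i][j]."""
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, i, j)) for j in range(n)]
            for i in range(n)]

def numeric_grad(m, eps=1e-6):
    """Central finite-difference estimate of the gradient of det(m)."""
    n = len(m)
    grad = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            plus = [row[:] for row in m]
            minus = [row[:] for row in m]
            plus[i][j] += eps
            minus[i][j] -= eps
            grad[i][j] = (det(plus) - det(minus)) / (2 * eps)
    return grad

a = [[1.0, 2.0], [3.0, 4.0]]  # same matrix as a0 in the training example
analytic = jacobi_grad(a)
numeric = numeric_grad(a)
print(analytic)  # [[4.0, -3.0], [-2.0, 1.0]]
print(max(abs(analytic[i][j] - numeric[i][j])
          for i in range(2) for j in range(2)))
```

For this matrix det(A) = -2, and -2 times the transposed inverse gives the same [[4, -3], [-2, 1]], so the two estimates should agree up to rounding error. In the TensorFlow gradient function, this matrix is additionally scaled by the incoming grad.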

    Then a simple training loop to check that it works:

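    import numpy as np
    import tensorflow as tf

    a0 = np.array([[1,2],[3,4]]).astype(np.float32)
    a = tf.Variable(a0)
    b = tf.square(tf.matrix_determinant(a))
    init_op = tf.initialize_all_variables()
    sess = tf.InteractiveSession()
    init_op.run()  # initialize the variables in the interactive session
    minimization_steps = 50
    learning_rate = 0.001
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    train_op = optimizer.minimize(b)
    losses = []
    for i in range(minimization_steps):
      train_op.run()           # one gradient descent step
      losses.append(b.eval())  # record the current loss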

    Then you can visualize your loss over time

    import matplotlib.pyplot as plt
    plt.plot(losses)
    plt.ylabel("Determinant Squared")
    plt.show()

    You should see something like this: [Figure: loss plot]