Tags: gradient, theano, tensorflow, backpropagation, autodiff

Can auto differentiation handle separate functions of array slices?


Given a vector v of length, say, 30, can auto differentiation tools such as Theano or TensorFlow take the gradient of something like this:

x = np.random.rand(5, 1)
v = f(x, z)
w = v[0:25].reshape(5, 5)
y = g(np.matmul(w, x) + v[25:30])
minimize ( || y - x || )

Would this even make sense? The way I picture it, I would have to multiply by identity vectors/matrices with trailing zeros to convert v into w.
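
For concreteness, a minimal NumPy sketch of that masking construction (the selector matrix S is purely illustrative):

import numpy as np

v = np.random.rand(30)
# 25x30 "identity with trailing zeros" that picks out the first 25 entries of v
S = np.hstack([np.eye(25), np.zeros((25, 5))])
w = (S @ v).reshape(5, 5)   # identical to v[0:25].reshape(5, 5)
assert np.allclose(w, v[0:25].reshape(5, 5))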


Solution

  • Slice and reshape operations fit into the standard reverse-mode AD framework in the same way as any other op. Below is a simple TensorFlow program that is similar to the example you gave (I had to change a couple of things to make the dimensions match), along with the resulting computation graph for the gradient.

    import numpy as np
    import tensorflow as tf  # written against the TF 1.x graph-mode API
    
    def f(x, z):
      """Adds values together, reshapes into a vector."""
      return tf.reshape(x + z, (5,))
    
    x = tf.Variable(np.random.rand(5, 1))
    z = tf.Variable(np.random.rand(5, 1))
    v = f(x, z)
    w = tf.slice(v, [0], [5])        # begin and size are lists in tf.slice
    w = tf.reshape(w, (5, 1))        # reshape the slice into a column vector
    y = tf.matmul(w, tf.transpose(x)) + tf.slice(v, [0], [5])
    cost = tf.square(tf.reduce_sum(y - x))
    print(tf.gradients(cost, [x, z]))
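
    Printing tf.gradients only shows the symbolic gradient tensors; to get numeric values, the graph has to be run. A minimal sketch, assuming the TF 1.x session API:

    sess = tf.Session()
    sess.run(tf.global_variables_initializer())         # initialize x and z
    grad_x, grad_z = sess.run(tf.gradients(cost, [x, z]))
    print(grad_x.shape, grad_z.shape)                    # both (5, 1)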
    

    [image: the resulting computation graph for the gradient]
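
  • Under the hood, the gradient of a slice is just a scatter of the upstream gradient into a zero tensor shaped like v, which achieves the effect of the "identity matrix with trailing zeros" from the question without ever materializing such a matrix. A minimal NumPy sketch of that adjoint, as an illustration rather than TensorFlow's literal implementation:

    import numpy as np
    
    v = np.random.rand(30)
    w = v[0:25].reshape(5, 5)            # forward: slice, then reshape
    b = v[25:30]
    
    # backward pass: suppose upstream gradients dL/dw and dL/db have arrived
    dL_dw = np.ones((5, 5))
    dL_db = np.ones(5)
    
    dL_dv = np.zeros_like(v)             # adjoint of slice: scatter into zeros
    dL_dv[0:25] += dL_dw.reshape(25)     # adjoint of reshape: reshape back
    dL_dv[25:30] += dL_db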