Tags: python, tensorflow, gradient, autodiff

Defining a custom gradient as a class method in TensorFlow


I need to define a class method as a custom gradient, as follows:

import tensorflow as tf

class CustGradClass:

    def __init__(self):
        pass

    @tf.custom_gradient
    def f(self, x):
        fx = x
        def grad(dy):
            return dy * 1
        return fx, grad

I am getting the following error:

ValueError: Attempt to convert a value (<__main__.CustGradClass object at 0x12ed91710>) with an unsupported type (<class '__main__.CustGradClass'>) to a Tensor.

The reason is that tf.custom_gradient expects a function f(*x), where x is a sequence of Tensors, but when f is defined as an instance method the first argument passed to it is the object itself, i.e. self, which cannot be converted to a Tensor.

From the documentation:

f: function f(*x) that returns a tuple (y, grad_fn) where:
  • x is a sequence of Tensor inputs to the function.
  • y is a Tensor or sequence of Tensor outputs of applying TensorFlow operations in f to x.
  • grad_fn is a function with the signature g(*grad_ys).
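
For contrast, decorating a plain module-level function works, because every positional argument is then a Tensor. A minimal sketch (f_free is just an illustrative name, not from my code above):

import tensorflow as tf

@tf.custom_gradient
def f_free(x):
    fx = x
    def grad(dy):
        return dy * 1
    return fx, grad

y = f_free(tf.constant(1.0))  # no ValueError: x is the only argument and it is a Tensor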

How do I make this work? Do I need to inherit from some TensorFlow Python class?

I am using TensorFlow 1.12.0 in eager mode.


Solution

  • One possible simple workaround is the following:

    import tensorflow as tf
    
    class CustGradClass:

        def __init__(self):
            # Bind self in a lambda so that tf.custom_gradient only ever
            # receives Tensor arguments.
            self.f = tf.custom_gradient(lambda x: CustGradClass._f(self, x))

        @staticmethod
        def _f(self, x):
            # _f is a staticmethod, so self is not passed implicitly; it is
            # supplied explicitly by the lambda above.
            fx = x * 1
            def grad(dy):
                return dy * 1
            return fx, grad
    
    with tf.Graph().as_default(), tf.Session() as sess:
        x = tf.constant(1.0)
        c = CustGradClass()
        y = c.f(x)
        print(tf.gradients(y, x))
        # [<tf.Tensor 'gradients/IdentityN_grad/mul:0' shape=() dtype=float32>]
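
    Since the question mentions eager mode, the same workaround should also work there with tf.GradientTape. A minimal sketch, assuming TF 1.12 with eager execution enabled at program start:

    import tensorflow as tf
    tf.enable_eager_execution()

    x = tf.constant(1.0)
    c = CustGradClass()
    with tf.GradientTape() as tape:
        tape.watch(x)  # x is a constant, so it must be watched explicitly
        y = c.f(x)
    print(tape.gradient(y, x))  # dy/dx is 1.0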
    

    EDIT:

    If you want to do this many times across different classes, or just want a more reusable solution, you can use a decorator like this, for example:

    import functools
    import tensorflow as tf
    
    def tf_custom_gradient_method(f):
        @functools.wraps(f)
        def wrapped(self, *args, **kwargs):
            # Lazily build and cache, per instance, a tf.custom_gradient wrapper
            # around a lambda that binds self, so TensorFlow only ever sees the
            # Tensor arguments.
            if not hasattr(self, '_tf_custom_gradient_wrappers'):
                self._tf_custom_gradient_wrappers = {}
            if f not in self._tf_custom_gradient_wrappers:
                self._tf_custom_gradient_wrappers[f] = tf.custom_gradient(
                    lambda *a, **kw: f(self, *a, **kw))
            return self._tf_custom_gradient_wrappers[f](*args, **kwargs)
        return wrapped
    

    Then you could just do:

    class CustGradClass:
    
        def __init__(self):
            pass
    
        @tf_custom_gradient_method
        def f(self, x):
            fx = x * 1
            def grad(dy):
                return dy * 1
            return fx, grad
    
        @tf_custom_gradient_method
        def f2(self, x):
            fx = x * 2
            def grad(dy):
                return dy * 2
            return fx, grad
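
    Used the same way as before, both methods then get their custom gradients. A short usage sketch, assuming the class above and the same graph-mode setup as the first snippet:

    with tf.Graph().as_default(), tf.Session() as sess:
        x = tf.constant(1.0)
        c = CustGradClass()
        y1 = c.f(x)
        y2 = c.f2(x)
        print(sess.run(tf.gradients(y1, x)))  # [1.0]
        print(sess.run(tf.gradients(y2, x)))  # [2.0]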