
How to simulate ReLU gradient with tf.GradientTape


TensorFlow has a feature called GradientTape, which records operations so that gradients can be computed via automatic differentiation.

I'm trying to simulate the gradient of ReLU, but the code below doesn't work on the negative half of X.

#colab or ipython reset
%reset -f

#libs
import tensorflow as tf;

#init
tf.enable_eager_execution();

#code
x = tf.convert_to_tensor([-3,-2,-1,0,1,2,3],dtype=tf.float32);

with tf.GradientTape() as t:
  t.watch(x);
  y = fx = x; #THIS IS JUST THE POSITIVE HALF OF X

dy_dx = t.gradient(y,x);
print(dy_dx); 

I guess I have to change something at the line y = fx = x, like adding an if x<=0 condition, but I can't figure out how.

The above code prints out:

tf.Tensor([1. 1. 1. 1. 1. 1. 1.], shape=(7,), dtype=float32)

This makes sense: y = x is the identity function, whose derivative is 1 everywhere. The desired ReLU gradient, however, is:

tf.Tensor([0. 0. 0. 0. 1. 1. 1.], shape=(7,), dtype=float32)

Solution

  • The following grad function simulates the conditional on X in the ReLU function, though I don't know whether it's the recommended way to do it (a more direct alternative is sketched after this block):

    #ipython
    %reset -f
    
    #libs
    import tensorflow as tf;
    
    #init
    tf.enable_eager_execution();
    
    #code
    X = tf.convert_to_tensor([-3,-2,-1,0,1,2,3], dtype=tf.float32);
    
    with tf.GradientTape() as T:
      T.watch(X);
      Y = Fx = X;
    #end with
    
    Dy_Dx = T.gradient(Y,X);
    #print(Dy_Dx);
    
    #get gradient of function Fx with the ReLU conditional on X:
    #return 0 on the negative half (At<=0), otherwise look up
    #the taped gradient at the input position matching At
    def grad(Y,At):
      if (At<=0): return 0;
    
      for I in range(len(X)):
        if X[I].numpy()==At:
          return Dy_Dx[I].numpy();
    #end def
    
    print(grad(Y,-3));
    print(grad(Y,-2));
    print(grad(Y,-1));
    print(grad(Y,-0));
    print(grad(Y,1));
    print(grad(Y,2));
    print(grad(Y,3));
    
    print("\nDone.");
    #eof
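
  • Alternatively, it may be cleaner to compute ReLU inside the tape, so that GradientTape differentiates the conditional itself. Here is a minimal sketch in the same setup, using the standard tf.nn.relu (tf.maximum(X, 0.0) would work equally well); note that TensorFlow defines the gradient of ReLU at 0 as 0, which matches the wanted output:

    #ipython
    %reset -f
    
    #libs
    import tensorflow as tf;
    
    #init
    tf.enable_eager_execution();
    
    #code
    X = tf.convert_to_tensor([-3,-2,-1,0,1,2,3], dtype=tf.float32);
    
    with tf.GradientTape() as T:
      T.watch(X);
      Y = tf.nn.relu(X); #max(0,X) is recorded on the tape
    #end with
    
    print(T.gradient(Y,X)); #tf.Tensor([0. 0. 0. 0. 1. 1. 1.], shape=(7,), dtype=float32)
    #eof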