python · tensorflow · neural-network · derivative · activation-function

What activation function should I use to enforce rounding-like behaviour?


I need an activation function that rounds my tensors.

The derivative (gradient) of round() is 0 almost everywhere (or None in TensorFlow), which makes it unusable as an activation function.

I am looking for a function that enforces rounding-like behaviour, so that the outputs of my model don't just approximate a number (my labels are integers).

I know that the composition tanh ∘ sigmoid was used to enforce that only the values {-1, 0, 1} flow through a model, so is there some combination of differentiable functions that simulates rounding behaviour?


Solution

  • If you'd like to approximate round on the real line, you can do something like the following:

    import tensorflow as tf

    def approx_round(x, steepness=1):
        floor_part = tf.floor(x)          # integer part; its gradient is zero
        remainder = tf.math.mod(x, 1)     # fractional part in [0, 1)
        # a steep sigmoid centred at 0.5 approximates the 0-to-1 rounding jump
        return floor_part + tf.sigmoid(steepness*(remainder - 0.5))
    

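    For instance (illustrative values, assuming TF 2.x eager execution), the approximation tracks tf.round closely once steepness is reasonably large:

    x = tf.constant([0.2, 0.7, 1.4, 2.9])
    print(approx_round(x, steepness=50).numpy())  # ~[0., 1., 1., 3.]
    print(tf.round(x).numpy())                    # [0., 1., 1., 3.]
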
    There are, in fact, ways to register your own gradients in TensorFlow (see, for example, this question); a minimal sketch of that idea is given after the gradient function below. However, I am not as familiar with that part, as I don't use Keras/TensorFlow that often.

    A function that gives you the gradient of this approximation would be the following:

    def approx_round_grad(x, steepness=1):
        remainder = tf.math.mod(x, 1)
        sig = tf.sigmoid(steepness*(remainder - 0.5))
        # chain rule: d/dx sigmoid(k*(x - 0.5)) = k * sig * (1 - sig)
        return steepness*sig*(1 - sig)
    

    To be clear, this approximation assumes you're using a "steep enough" steepness parameter, since the sigmoid function doesn't go to exactly 0 or 1, except in the limit of large arguments.
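
    As a minimal sketch of the "register your own gradient" idea mentioned above (assuming TF 2.x; the name round_with_surrogate_grad and the steepness value are only illustrative), you could keep the exact tf.round in the forward pass and substitute the smooth gradient in the backward pass via tf.custom_gradient:

    @tf.custom_gradient
    def round_with_surrogate_grad(x):
        steepness = 50.0  # assumed value; tune to taste
        def grad(dy):
            # backward pass: use the smooth approximation's gradient
            # instead of round()'s zero gradient
            return dy * approx_round_grad(x, steepness)
        return tf.round(x), grad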

    To do something like a half-sine approximation instead, you could use the following:

    import numpy as np  # np.pi is used below

    def approx_round_sin(x, width=0.1):
        if width > 1 or width <= 0:
            raise ValueError('Width must be between zero (exclusive) and one (inclusive)')
        floor_part = tf.floor(x)
        remainder = tf.math.mod(x, 1)
        return floor_part + clipped_sin(remainder, width)
    
    def clipped_sin(x, width):
        # rises smoothly from 0 to 1 over a window of the given width centred at 0.5
        half_width = width/2
        sin_part = (1 + tf.sin(np.pi*((x - 0.5)/width)))/2
        whole = sin_part*tf.cast(tf.abs(x - 0.5) < half_width, tf.float32)
        whole += tf.cast(x > 0.5 + half_width, tf.float32)  # 1 above the window
        return whole
    
    def approx_round_grad_sin(x, width=0.1):
        if width > 1 or width <= 0:
            raise ValueError('Width must be between zero (exclusive) and one (inclusive)')
        remainder = tf.math.mod(x, 1)
        return clipped_cos(remainder, width)
    
    def clipped_cos(x, width):
        # derivative of clipped_sin: nonzero only inside the window around 0.5
        half_width = width/2
        cos_part = np.pi*tf.cos(np.pi*((x - 0.5)/width))/(2*width)
        return cos_part*tf.cast(tf.abs(x - 0.5) < half_width, dtype=tf.float32)
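
    As a quick sanity check (assuming TF 2.x eager execution and the definitions above), the analytic gradient agrees with what autodiff produces, since tf.floor contributes no gradient and tf.math.mod contributes a unit gradient almost everywhere:

    x = tf.constant([0.2, 0.48, 0.53, 1.3])
    with tf.GradientTape() as tape:
        tape.watch(x)
        y = approx_round_sin(x, width=0.1)
    print(tape.gradient(y, x).numpy())                   # autodiff gradient
    print(approx_round_grad_sin(x, width=0.1).numpy())   # matches the line above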