keras

Implementing a complicated activation function in keras


I just read an interesting paper: A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks.

I'd like to try to implement this activation function in Keras. I've implemented custom activations before, e.g. a sinusoidal activation:

import keras.backend as K
from keras.layers import Activation
from keras.utils.generic_utils import get_custom_objects

def sin(x):
    return K.sin(x)
get_custom_objects().update({'sin': Activation(sin)})

However, the activation function in this paper has 3 unique properties:

  1. It doubles the size of the input (the output is 2x the input)
  2. It's parameterized
  3. Its parameters should be regularized

I think once I have a skeleton for dealing with the above 3 issues, I can work out the math myself, but I'll take any help I can get!


Solution

  • Here, we will need one of these two:

    • A Lambda layer - If your parameters are not trainable (you don't want them to change with backpropagation)
    • A custom layer - If you need custom trainable parameters.

    The Lambda layer:

    If your parameters are not trainable, you can define your function for a lambda layer. The function takes one input tensor, and it can return anything you want:

    import keras.backend as K
    
    def customFunction(x):
    
        #x can be either a single tensor or a list of tensors
        #if a list, use the elements x[0], x[1], etc.
    
        #Perform your calculations here using the keras backend
        #If you could share which formula exactly you're trying to implement, 
            #it's possible to make this answer better and more to the point    
    
        #dummy example
        alphaReal = K.variable([someValue])    
        alphaImag = K.variable([anotherValue]) #or even an array of values   
    
        realPart = alphaReal * K.someFunction(x) + ...
        imagPart = alphaImag * K.someFunction(x) + ...

        #You can return them as two outputs in a list (this requires the functional API model)
        #Or you can find backend functions that join them together, such as K.stack

        #I think the separate approach will give you better control of what to do next.
        return [realPart, imagPart]
    

    To see what operations you can perform, explore the keras backend functions.

    For the parameters, you can define them as keras constants or variables (K.constant or K.variable), either inside or outside the function above, or even turn them into model inputs. See details in this answer.
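    For example (a small illustrative sketch; the name alpha is a placeholder, not from the paper):

    import keras.backend as K

    alphaFixed = K.constant(0.3)     #a fixed value that never changes
    alphaVar = K.variable([0.3])     #a value you can change later with K.set_value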

    In your model, you just add a lambda layer that uses that function.

    • In a Sequential model: model.add(Lambda(customFunction, output_shape=someShape))
    • In a functional API model: output = Lambda(customFunction, ...)(inputOrListOfInputs)

    If you're going to pass more inputs to the function, you'll need the functional API model.
    If you're using TensorFlow, the output_shape will be computed automatically; I believe only Theano requires it. (Not sure about CNTK.)
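
    A minimal runnable sketch of the Lambda approach, assuming a made-up non-trainable parameter and a dummy formula that just concatenates two transformed copies of the input along the last axis (so the output is twice the size of the input; none of this is the paper's actual formula):

    import keras.backend as K
    from keras.layers import Input, Lambda
    from keras.models import Model

    def doubleActivation(x):
        alpha = K.constant(0.5)                        #non-trainable parameter
        part1 = K.exp(alpha * x)                       #dummy calculation
        part2 = K.log(1.0 + K.abs(alpha * x))          #dummy calculation
        return K.concatenate([part1, part2], axis=-1)  #output is 2x the input size

    inp = Input(shape=(10,))
    out = Lambda(doubleActivation, output_shape=(20,))(inp)
    model = Model(inp, out)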

    The custom layer:

    A custom layer is a new class you create. This approach is only necessary if you have trainable parameters in your function (such as optimizing alpha with backpropagation).

    Keras teaches it here.

    Basically, you have an __init__ method where you pass the constant parameters, a build method where you create the trainable parameters (weights), a call method that will do the calculations (exactly what would go in the lambda layer if you didn't have trainable parameters), and a compute_output_shape method so you can tell the model what the output shape is.

    import keras.backend as K
    from keras.engine.topology import Layer

    class CustomLayer(Layer):
    
        def __init__(self, alphaReal, alphaImag, **kwargs):
    
            self.alphaReal = alphaReal
            self.alphaImag = alphaImag
            super(CustomLayer, self).__init__(**kwargs)
    
        def build(self,input_shape):
    
            #weights may or may not depend on the input shape
            #you may use it or not...
    
            #suppose we want just two trainable values:
            weightShape = (2,)
    
            #create the weights:
            self.kernel = self.add_weight(name='kernel', 
                                      shape=weightShape,
                                      initializer='uniform',
                                      trainable=True)
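            #note: add_weight also accepts a "regularizer" argument
            #(e.g. regularizer=keras.regularizers.l2(0.01)), which is one way
            #to handle the requirement that the parameters be regularized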
    
            super(CustomLayer, self).build(input_shape)  # Be sure to call this somewhere!
    
        def call(self,x):
    
            #all the calculations go here:
    
            #dummy example using the constant inputs
            realPart = self.alphaReal * K.someFunction(x) + ...
            imagPart = self.alphaImag * K.someFunction(x) + ...
    
            #dummy example taking elements of the trainable weights
            realPart = self.kernel[0] * realPart    
            imagPart = self.kernel[1] * imagPart
    
            #all the comments for the lambda layer above are valid here
    
            #example returning a list
            return [realPart,imagPart]
    
        def compute_output_shape(self,input_shape):
    
            #if you decide to return a list of tensors in the call method, 
            #return a list of shapes here, twice the input shape:
            return [input_shape, input_shape]
    
            #if you stacked your results somehow into a single tensor instead,
            #return a single tuple, maybe with an additional dimension equal to 2:
            #return input_shape + (2,)
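
    A minimal usage sketch, assuming the call method returns the list of two tensors as above (the shape and alpha values here are placeholders, and your Keras version must support layers with multiple outputs):

    from keras.layers import Input
    from keras.models import Model

    inp = Input(shape=(10,))
    realOut, imagOut = CustomLayer(alphaReal=1.0, alphaImag=0.5)(inp)    #the layer returns two tensors
    model = Model(inputs=inp, outputs=[realOut, imagOut])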