
Keras remove activation function of last layer


I want to use ResNet50 with Imagenet weights.

The last layer of ResNet50 is (from here)

x = layers.Dense(1000, activation='softmax', name='fc1000')(x)

I need to keep the weights of this layer but remove the softmax function.

I want to manually change it so my last layer looks like this

x = layers.Dense(1000, name='fc1000')(x)

but the weights stay the same.

Currently I call my net like this

resnet = Sequential([
    Input(shape=(224,224,3)),
    ResNet50(weights='imagenet', input_shape=(224,224,3))
    ])

I need the Input layer because otherwise model.compile complains that the placeholders aren't filled.


Solution

  • Generally, there are two ways of achieving this:

    Quick way - supported functions:

    To change the final layer's activation function, you can pass the classifier_activation argument.
    So, to get rid of the activation altogether, the model can be called like this:

    import tensorflow as tf
    
    resnet = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(224,224,3)),
        tf.keras.applications.ResNet50(
            weights='imagenet', 
            input_shape=(224,224,3),
            pooling="avg",
            classifier_activation=None
            )
        ])
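
    As a quick sanity check, you can confirm that the classification head now has no softmax. The sketch below uses weights=None purely so that the check does not download the ImageNet weights; with weights='imagenet' the activation is set the same way.

    ```python
    import tensorflow as tf

    # Build the same architecture; weights=None is an assumption here,
    # used only to keep this check lightweight (no weight download).
    model = tf.keras.applications.ResNet50(
        weights=None,
        input_shape=(224, 224, 3),
        classifier_activation=None,
    )

    # With classifier_activation=None, the final Dense layer falls back
    # to the identity ("linear") activation, so the model outputs raw logits.
    assert model.layers[-1].activation is tf.keras.activations.linear
    ```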
    

    This, however, will not work if you want a different function that is not supported by the Keras classifier_activation parameter (e.g. a custom activation function).

    To achieve this you can use the workaround solution:

    Long way - copy the model's weights

    This solution copies the original model's weights onto your custom one. This works because, apart from the activation function, you are not changing the model's architecture.

    You need to:
    1. Download the original model.
    2. Save its weights.
    3. Declare your modified version of the model (in your case, without the activation function).
    4. Set the weights of the new model.

    The snippet below demonstrates this concept in more detail:

    import tensorflow as tf
    
    # 1. Download the original resnet (with its classification head)
    
    resnet = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(224,224,3)),
        tf.keras.applications.ResNet50(
            weights='imagenet', 
            input_shape=(224,224,3),
            pooling="avg"
            )
        ])
    
    # 2. Hold weights in memory:
    imagenet_weights = resnet.get_weights()
    
    # 3. Declare the model, but without softmax
    resnet_no_softmax = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(224,224,3)),
        tf.keras.applications.ResNet50(
            include_top=False,
            weights='imagenet', 
            input_shape=(224,224,3),
            pooling="avg"
            ),
        tf.keras.layers.Dense(1000, name='fc1000')
    ])
    
    # 4. Pass the imagenet weights onto the second resnet
    resnet_no_softmax.set_weights(imagenet_weights)
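
    The same copy-weights idea can be verified on a tiny toy model (a minimal sketch; the layer names and sizes here are made up for illustration): applying a softmax to the copied model's logits should recover the original model's output exactly.

    ```python
    import numpy as np
    import tensorflow as tf

    # A model whose last layer has a softmax, and an identical one without it.
    with_softmax = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(8,)),
        tf.keras.layers.Dense(4, activation='softmax', name='head'),
    ])
    no_softmax = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(8,)),
        tf.keras.layers.Dense(4, name='head'),
    ])

    # Copy the weights, exactly as done with the two ResNets above.
    no_softmax.set_weights(with_softmax.get_weights())

    x = np.random.rand(2, 8).astype('float32')
    probs = with_softmax(x).numpy()
    logits = no_softmax(x).numpy()

    # Softmax over the copied model's logits recovers the original output.
    assert np.allclose(tf.nn.softmax(logits).numpy(), probs, atol=1e-6)
    ```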
    
    

    Hope this helps!