python-3.x, keras, deep-learning, imagenet

How is dropout implemented in Keras MobileNetV3 ImageNet weights during transfer learning when some layers are frozen (made untrainable)?


I am working on an image classification problem using the Keras MobileNetV3 model pre-trained on ImageNet, with 90% of the layers frozen and the remaining 10% made trainable, while applying a dropout of 0.2. I was wondering how this is handled in the backend.

from tensorflow.keras.applications import MobileNetV3Small

IMG_HEIGHT, IMG_WIDTH, DEPTH = 224, 224, 3  # example input size

base_model = MobileNetV3Small(input_shape=(IMG_HEIGHT, IMG_WIDTH, DEPTH),
                              alpha=1.0,
                              minimalistic=False,
                              include_top=False,
                              weights='imagenet',
                              input_tensor=None,
                              pooling='max',
                              dropout_rate=0.2)
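
For reference, this is roughly how I freeze the first 90% of the layers (the exact cutoff computation below is just illustrative):

# Freeze about the first 90% of the layers; the remaining ~10% stay trainable.
cutoff = int(len(base_model.layers) * 0.9)
for layer in base_model.layers[:cutoff]:
    layer.trainable = False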

Solution

  • If the layer is called with the argument training=False, as it is when you predict, nothing will happen. Let's start with some input:

    import tensorflow as tf
    
    rate = 0.4
    
    dropout = tf.keras.layers.Dropout(rate) 
    
    x = tf.cast(tf.reshape(tf.range(1, 10), (3, 3)), tf.float32)
    x
    
    <tf.Tensor: shape=(3, 3), dtype=float32, numpy=
    array([[1., 2., 3.],
           [4., 5., 6.],
           [7., 8., 9.]], dtype=float32)>
    

    Now, let's call the Dropout layer in training mode:

    dropout(x, training=True)
    
    <tf.Tensor: shape=(3, 3), dtype=float32, numpy=
    array([[ 0.       ,  3.3333333,  0.       ],
           [ 6.6666665,  8.333333 ,  0.       ],
           [11.666666 , 13.333333 , 15.       ]], dtype=float32)>
    

    As you can see, all the remaining values are multiplied by 1/(1 - p); with p = 0.4 that is 1/0.6 ≈ 1.667, while a random ~40% of the entries are zeroed (an empirical check of this scaling is sketched at the end of this answer). Now let's call the layer with training=False:

    dropout(x, training=False)
    
    <tf.Tensor: shape=(3, 3), dtype=float32, numpy=
    array([[1., 2., 3.],
           [4., 5., 6.],
           [7., 8., 9.]], dtype=float32)>
    

    Nothing happens: the input passes through unchanged.
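
    Two sketches of my own to round this out (not from the original snippet). First, you can verify the scaling empirically: since the kept entries are multiplied by 1/(1 - rate), the average over many training-mode passes approximately recovers the original input:

    # Each pass zeroes a random ~40% of entries and scales the rest by
    # 1/(1 - rate), so the mean over many passes approximates x.
    samples = tf.stack([dropout(x, training=True) for _ in range(10000)])
    tf.reduce_mean(samples, axis=0)  # close to the original x

    Second, the part that answers the question directly: freezing a layer (layer.trainable = False) only excludes its weights from gradient updates. Dropout has no weights, so freezing does not disable it; only the training flag controls it, and model.fit() calls the model with training=True while model.predict() uses training=False. A minimal toy sketch (this model is hypothetical, not MobileNetV3):

    # A toy model containing nothing but a Dropout layer.
    inputs = tf.keras.Input(shape=(3,))
    outputs = tf.keras.layers.Dropout(rate)(inputs)
    model = tf.keras.Model(inputs, outputs)

    model.trainable = False               # "freeze" the whole model

    ones = tf.ones((1, 3))
    model(ones, training=True)            # dropout is still applied
    model(ones, training=False)           # identity: input passes through
    model.predict(ones)                   # predict() implies training=False

    So in your MobileNetV3 setup, the dropout of 0.2 stays active during training regardless of which layers are frozen, and is switched off automatically at inference time.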