Discrepancy in the number of trainable parameters between model.summary and len(conv_model.trainable_weights)


Consider this TensorFlow (Python) code that loads a pretrained model:

import tensorflow as tf

# Load the VGG16 convolutional base with ImageNet weights (no classifier head).
conv_model = tf.keras.applications.vgg16.VGG16(
    weights='imagenet',
    include_top=False)

conv_model.trainable = False
print("Number of trainable weights after freezing: ", len(conv_model.trainable_weights))
conv_model.trainable = True
print("Number of trainable weights after unfreezing: ", len(conv_model.trainable_weights))

This prints:

Number of trainable weights after freezing:  0
Number of trainable weights after unfreezing:  26

However, if I do

conv_model.trainable=True
conv_model.summary()

I get:

Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0

and if I freeze the model, the summary reports 0 trainable parameters.

Why is there a discrepancy between model.summary() and len(conv_model.trainable_weights)?


Solution

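  • The length of trainable_weights is the number of weight tensors, not the total number of parameters. To count parameters, sum the sizes of those tensors:

    import numpy as np
    # Note: this import path depends on the installed Keras version.
    from keras.utils.layer_utils import count_params

    np.sum([count_params(p) for p in conv_model.trainable_weights])
    #14714688
    

    instead of:

    len(conv_model.trainable_weights)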
    

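    len() gives the number of kernel and bias tensors (13 conv layers x 2 = 26). Each of them can be inspected with:

    for p in conv_model.trainable_weights:
        # np.prod(p.shape) is the element count of the tensor; it matches count_params(p)
        print(p.name, p.shape, np.prod(p.shape), count_params(p))
    
    # prints 26 rows, one per weight tensor:
    # name                  shape            params params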
    
    block1_conv1/kernel:0   (3, 3, 3, 64)    1728   1728
    block1_conv1/bias:0     (64,)            64     64
    block1_conv2/kernel:0   (3, 3, 64, 64)   36864  36864
    ...
    block5_conv3/kernel:0   (3, 3, 512, 512) 2359296 2359296
    block5_conv3/bias:0     (512,)           512     512
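
The two numbers can also be reproduced without loading the model at all. The sketch below is only an illustration: it hard-codes the (input, output) channel widths of the 13 convolutional layers in the standard VGG16 base as an assumption instead of reading them from Keras, and the names VGG16_CONV_CHANNELS, weight_shapes, num_tensors and total_params are made up for this example. It builds one kernel shape and one bias shape per layer, then compares the number of tensors with the number of scalar parameters.

```python
from math import prod

# (in_channels, out_channels) for the 13 conv layers of the VGG16 base.
# Assumption: hard-coded from the standard architecture, not read from Keras.
VGG16_CONV_CHANNELS = [
    (3, 64), (64, 64),                       # block1
    (64, 128), (128, 128),                   # block2
    (128, 256), (256, 256), (256, 256),      # block3
    (256, 512), (512, 512), (512, 512),      # block4
    (512, 512), (512, 512), (512, 512),      # block5
]

# Every conv layer contributes two weight tensors: a 3x3 kernel and a bias.
weight_shapes = []
for c_in, c_out in VGG16_CONV_CHANNELS:
    weight_shapes.append((3, 3, c_in, c_out))  # kernel
    weight_shapes.append((c_out,))             # bias

# What len(trainable_weights) measures: the number of tensors.
num_tensors = len(weight_shapes)

# What model.summary() measures: the number of scalars across all tensors.
total_params = sum(prod(shape) for shape in weight_shapes)

print(num_tensors)   # 26
print(total_params)  # 14714688
```

So 26 is the count of kernel and bias tensors, while 14,714,688 is the count of individual values stored in them; model.summary() reports the latter.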