Tags: python, tensorflow, machine-learning, keras, vgg-net

How to force the Keras VGG16 model to show and include its detailed layers when used in new customized models


Summary: How to force the keras.applications.VGG16 layers, rather than the single VGG model layer, to show up and be included as layers in new customized models.

Details:

  1. I was building customized models (denoted as model) on top of keras.applications.VGG16 (denoted as conv_base). Specifically, I replaced the last dense layers with my own layers.

    from tensorflow.keras.applications import VGG16
    from tensorflow.keras import layers, models

    conv_base = VGG16(weights='imagenet',    # pre-trained on ImageNet
                      include_top=False,     # exclude the three top layers
                      input_shape=(64, 64, 3),
                      pooling='max')
    model = models.Sequential()
    model.add(conv_base)
    model.add(layers.BatchNormalization())
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(256, activation='linear'))
    model.add(layers.BatchNormalization())
    model.add(layers.Dropout(0.2))
    model.add(layers.Dense(1, activation='linear'))
    
  2. While I can see the layers of conv_base with conv_base.summary(), the new customized model only shows a single vgg16 layer (of type Model), rather than every layer inside VGG16, when calling model.summary() (shown in the figures).

    conv_base.summary()
    

Fig. 1: All layers are shown in conv_base.

    model.summary()

Fig. 2: VGG's detailed layers are not included in the model structure; instead, VGG is contained as a single layer of type Model.
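The nested summary looks roughly like this (layer names and numbering vary between sessions; the parameter counts follow from the layer definitions above):

    Model: "sequential"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    vgg16 (Model)                (None, 512)               14714688
    _________________________________________________________________
    batch_normalization (Batch   (None, 512)               2048
    _________________________________________________________________
    dropout (Dropout)            (None, 512)               0
    _________________________________________________________________
    dense (Dense)                (None, 256)               131328
    _________________________________________________________________
    batch_normalization_1 (Batc  (None, 256)               1024
    _________________________________________________________________
    dropout_1 (Dropout)          (None, 256)               0
    _________________________________________________________________
    dense_1 (Dense)              (None, 1)                 257
    =================================================================
    Total params: 14,849,345
    Trainable params: 14,847,809
    Non-trainable params: 1,536
    _________________________________________________________________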

  3. Associated Issues

Although the VGG layers are accessible via model.get_layer('vgg16').layers, the nesting still occasionally causes other issues, including:

(1) Loading weights: nesting sometimes messes up the weight-loading process.

    model.load_weights('~path/weights.hdf5')  

[Screenshot: error raised when loading the weights]
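A workaround, assuming only the VGG weights need restoring (the file path below is hypothetical), is to load them into the nested sub-model directly:

    # Load weights into the nested sub-model directly, bypassing the
    # outer wrapper ('vgg_weights.hdf5' is a hypothetical path).
    model.get_layer('vgg16').load_weights('vgg_weights.hdf5')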

(2) Building new models: nesting also causes errors when calling the inner layers to build new models.

    model2 = Model(inputs=model.inputs, outputs=model.get_layer('vgg16').layers[1].output, name='Vis_Model') 

[Screenshot: error raised when calling inner layers to build a new model]
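The inner layers' output tensors belong to vgg16's internal graph rather than the outer model's, which is why mixing model.inputs with an inner layer's output fails. A minimal sketch of a workaround is to build the sub-model from the nested model's own input:

    # Build the sub-model from the nested model's own input, which
    # lives in the same graph as its internal layers.
    vgg = model.get_layer('vgg16')
    model2 = Model(inputs=vgg.input,
                   outputs=vgg.layers[1].output,  # e.g. block1_conv1
                   name='Vis_Model')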

Thoughts: I could imagine partially fixing this by copying the keras.applications.VGG16 layers one by one into a new model, but how to reuse the pre-trained weights might be a problem. Any other ideas would be appreciated.
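For what it's worth, a minimal sketch of that layer-copying idea: clone_model rebuilds the architecture with fresh weights, and get_weights/set_weights transfer the pre-trained values, assuming the source and target architectures are identical.

    import tensorflow as tf
    from tensorflow.keras.applications import VGG16

    conv_base = VGG16(weights='imagenet', include_top=False,
                      input_shape=(64, 64, 3), pooling='max')

    # clone_model recreates the architecture with freshly initialized
    # weights; copying the weights over restores the pre-trained values.
    clone = tf.keras.models.clone_model(conv_base)
    clone.set_weights(conv_base.get_weights())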


Solution

  • EDIT: Based on your comments, here is an updated solution.

    You can flatten the nested model by iterating over the layers and appending them to a sequential model; this is the approach used in the code below.

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras import layers, Model, utils
    
    #Instantiating the VGG model
    conv_base = VGG16(weights='imagenet',    # pre-trained on ImageNet
                      include_top=False,     # exclude the three top layers
                      input_shape=(64, 64, 3),
                      pooling='max')
    
    #Defining the customized model with conv_base nested inside
    inp = layers.Input((64,64,3))
    cnn = conv_base(inp)
    x = layers.BatchNormalization()(cnn)
    x = layers.Dropout(0.2)(x)
    x = layers.Dense(256, activation='linear')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.2)(x)
    out = layers.Dense(1, activation='linear')(x)
    
    model = Model(inp, out)
    
    #Flattening the nested model
    def flatten_model(model_nested):
        layers_flat = []
        for layer in model_nested.layers:
            try:
                # Nested models expose a .layers attribute; splice
                # their sub-layers in directly.
                layers_flat.extend(layer.layers)
            except AttributeError:
                # Plain layers have no .layers attribute.
                layers_flat.append(layer)
        model_flat = tf.keras.models.Sequential(layers_flat)
        return model_flat
    
    model_flat = flatten_model(model)
    
    model_flat.summary()
    
    Model: "sequential_1"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    input_10 (InputLayer)        multiple                  0         
    _________________________________________________________________
    block1_conv1 (Conv2D)        (None, 64, 64, 64)        1792      
    _________________________________________________________________
    block1_conv2 (Conv2D)        (None, 64, 64, 64)        36928     
    _________________________________________________________________
    block1_pool (MaxPooling2D)   (None, 32, 32, 64)        0         
    _________________________________________________________________
    block2_conv1 (Conv2D)        (None, 32, 32, 128)       73856     
    _________________________________________________________________
    block2_conv2 (Conv2D)        (None, 32, 32, 128)       147584    
    _________________________________________________________________
    block2_pool (MaxPooling2D)   (None, 16, 16, 128)       0         
    _________________________________________________________________
    block3_conv1 (Conv2D)        (None, 16, 16, 256)       295168    
    _________________________________________________________________
    block3_conv2 (Conv2D)        (None, 16, 16, 256)       590080    
    _________________________________________________________________
    block3_conv3 (Conv2D)        (None, 16, 16, 256)       590080    
    _________________________________________________________________
    block3_pool (MaxPooling2D)   (None, 8, 8, 256)         0         
    _________________________________________________________________
    block4_conv1 (Conv2D)        (None, 8, 8, 512)         1180160   
    _________________________________________________________________
    block4_conv2 (Conv2D)        (None, 8, 8, 512)         2359808   
    _________________________________________________________________
    block4_conv3 (Conv2D)        (None, 8, 8, 512)         2359808   
    _________________________________________________________________
    block4_pool (MaxPooling2D)   (None, 4, 4, 512)         0         
    _________________________________________________________________
    block5_conv1 (Conv2D)        (None, 4, 4, 512)         2359808   
    _________________________________________________________________
    block5_conv2 (Conv2D)        (None, 4, 4, 512)         2359808   
    _________________________________________________________________
    block5_conv3 (Conv2D)        (None, 4, 4, 512)         2359808   
    _________________________________________________________________
    block5_pool (MaxPooling2D)   (None, 2, 2, 512)         0         
    _________________________________________________________________
    global_max_pooling2d_3 (Glob (None, 512)               0         
    _________________________________________________________________
    batch_normalization_4 (Batch (None, 512)               2048      
    _________________________________________________________________
    dropout_4 (Dropout)          (None, 512)               0         
    _________________________________________________________________
    dense_4 (Dense)              (None, 256)               131328    
    _________________________________________________________________
    batch_normalization_5 (Batch (None, 256)               1024      
    _________________________________________________________________
    dropout_5 (Dropout)          (None, 256)               0         
    _________________________________________________________________
    dense_5 (Dense)              (None, 1)                 257       
    =================================================================
    Total params: 14,849,345
    Trainable params: 14,847,809
    Non-trainable params: 1,536
    _________________________________________________________________
    
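    As a quick sanity check: the flattened model reuses the same layer objects (and therefore the same weights) as the nested one, so the two should produce identical predictions.

    # Both models share layer objects and weights, so their
    # predictions should match.
    x = np.random.rand(1, 64, 64, 3).astype('float32')
    print(np.allclose(model.predict(x), model_flat.predict(x)))  # expect True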

    I would also recommend an alternative way of summarizing the model.

    You can use utils.plot_model with expand_nested=True for this purpose.

    tf.keras.utils.plot_model(model, show_shapes=True, show_layer_names=True, expand_nested=True)
    

    [Plot of the model with the nested VGG layers expanded]
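    Note that plot_model requires the pydot and graphviz packages to be installed.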