Tags: keras, transfer-learning

Is it possible to train a CNN starting at an intermediate layer (in general and in Keras)?


I'm using mobilenet v2 to train a model on my images. I've frozen all but a few layers and then added additional layers for training. I'd like to be able to train from an intermediate layer rather than from the beginning. My questions:

  1. Is it possible to provide the output of the last frozen layer as the input for training (it would be a tensor of shape (?, 7, 7, 1280))?
  2. How does one specify training to start from that first trainable (non-frozen) layer? In this case, mbnetv2_conv.layers[153].
  3. What should y_train be in this case? I don't quite understand how y_train is used during the training process. In general, when does the CNN refer back to y_train?

Load MobileNetV2:

    from keras.applications.mobilenet_v2 import MobileNetV2
    from keras import models, layers
    import numpy as np

    image_size = 224
    mbnetv2_conv = MobileNetV2(weights='imagenet', include_top=False, input_shape=(image_size, image_size, 3))

    # Freeze all layers except the last 3
    for layer in mbnetv2_conv.layers[:-3]:
        layer.trainable = False

    # Create the model
    model = models.Sequential()
    model.add(mbnetv2_conv)
    model.add(layers.Flatten())
    model.add(layers.Dense(16, activation='relu')) 
    model.add(layers.Dropout(0.5)) 
    model.add(layers.Dense(3, activation='softmax')) 
    model.summary()

    # Build an array (?,224,224,3) from images
    x_train = np.array(all_images)

    # Get layer output
    from keras import backend as K
    get_last_frozen_layer_output = K.function([mbnetv2_conv.layers[0].input],
                                  [mbnetv2_conv.layers[152].output])
    last_frozen_layer_output = get_last_frozen_layer_output([x_train])[0]

    # Compile the model
    from keras.optimizers import SGD
    sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['acc'])

    # how to train from a specific layer and what should y_train be?
    model.fit(last_frozen_layer_output, y_train, batch_size=2, epochs=10)

Solution

  • Yes, you can. Two different ways.

    First, the hard way: build two new models, one containing all your frozen layers and one containing all your trainable layers. Add a Flatten() layer to the frozen-layers-only model, and copy the weights from MobileNetV2 into it layer by layer. Then run your input images through the frozen-layers-only model, saving the output to disk in CSV or pickle form. That output becomes the input for your trainable-layers model, which you train with model.fit() as you did above. Save the weights when you're done training. Finally, rebuild the original model with both sets of layers, load the saved weights into each layer, and save the whole thing. You're done! A rough sketch of this approach follows.
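
    A minimal sketch of the hard way, here using the functional API to slice off the frozen part of the pretrained graph instead of copying weights layer by layer (x_train and a one-hot y_train as in the question; the split index 153 and the filenames are illustrative):

    from keras.applications.mobilenet_v2 import MobileNetV2
    from keras import models, layers
    import numpy as np

    # Frozen-layers-only model: slicing the pretrained graph keeps the
    # ImageNet weights, so no layer-by-layer weight copying is needed.
    mbnetv2_conv = MobileNetV2(weights='imagenet', include_top=False,
                               input_shape=(224, 224, 3))
    frozen_model = models.Model(inputs=mbnetv2_conv.input,
                                outputs=mbnetv2_conv.layers[152].output)

    # Run the images through once and cache the features to disk.
    features = frozen_model.predict(x_train, batch_size=2)  # shape (?, 7, 7, 1280)
    np.save('frozen_features.npy', features)

    # Trainable-layers model, fed the cached features instead of raw images.
    top_model = models.Sequential()
    top_model.add(layers.Flatten(input_shape=features.shape[1:]))
    top_model.add(layers.Dense(16, activation='relu'))
    top_model.add(layers.Dropout(0.5))
    top_model.add(layers.Dense(3, activation='softmax'))
    top_model.compile(loss='categorical_crossentropy', optimizer='sgd',
                      metrics=['acc'])
    top_model.fit(np.load('frozen_features.npy'), y_train, batch_size=2, epochs=10)
    top_model.save_weights('top_weights.h5')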

    However, the easier way is to save the weights of your model separately from the architecture with:

    model.save_weights(filename)
    

    then modify the layer.trainable property of the layers in MobileNetV2 before you add it into a new, empty model with the same architecture (the weights can only be reloaded if the layers line up):

    mbnetv2_conv = MobileNetV2(weights='imagenet', include_top=False, input_shape=(image_size, image_size, 3))
    for layer in mbnetv2_conv.layers[:153]:
        layer.trainable = False
    # Rebuild the same architecture so the saved weights line up layer for layer
    newmodel = models.Sequential()
    newmodel.add(mbnetv2_conv)
    newmodel.add(layers.Flatten())
    newmodel.add(layers.Dense(16, activation='relu'))
    newmodel.add(layers.Dropout(0.5))
    newmodel.add(layers.Dense(3, activation='softmax'))
    

    then reload the weights with

    newmodel.load_weights(filename)
    

    This lets you adjust on the fly which layers of mbnetv2_conv you will be training, and then just call newmodel.fit() to continue training. As for what y_train should be (question 3): it's the array of labels for x_train, one-hot encoded to match the 3-unit softmax output; fit() compares the network's predictions against it on every batch to compute the loss and hence the gradients.
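
    A minimal sketch of that last step, assuming a hypothetical list labels of integer class ids (0-2), one per image in x_train:

    from keras.utils import to_categorical
    from keras.optimizers import SGD

    # labels is a hypothetical list of integer class ids, one per image
    y_train = to_categorical(labels, num_classes=3)  # shape (?, 3), one-hot

    # Recompile after changing layer.trainable so the change takes effect
    sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
    newmodel.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['acc'])
    newmodel.fit(x_train, y_train, batch_size=2, epochs=10)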