Tags: python, tensorflow, keras, deep-learning, transfer-learning

Issue training a model with ResNet152, saving the weights, then loading them and adding more layers


My goal is to first train using only ResNet152 and save the learned weights. Then I want to use these weights as the base for a more complex model with added layers, on which I ultimately want to do hyperparameter tuning. The reason for this approach is that doing it all at once takes a very long time. The problem I am having is that my code doesn't seem to work. I don't get an error message, but when I start training the more complex model it seems to start from zero again instead of using the learned ResNet152 weights.

Here is the code:

First, I am using only ResNet152 plus the output layer:

import tensorflow as tf
from tensorflow.keras import applications
from tensorflow.keras.layers import (Input, Flatten, Dense, LeakyReLU,
                                     BatchNormalization, Dropout)
from tensorflow.keras.models import Model

input_tensor = Input(shape=train_generator.image_shape)

base_model = applications.ResNet152(weights='imagenet', include_top=False, input_tensor=input_tensor)

# Train the whole backbone in this first phase
for layer in base_model.layers:
    layer.trainable = True

x = Flatten()(base_model.output)

predictions = Dense(num_classes, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=predictions)

model.compile(
  loss='sparse_categorical_crossentropy',
  optimizer=opt,
  metrics=['accuracy'])

model.fit(
  train_generator,
  validation_data=valid_generator,
  epochs=epochs,
  steps_per_epoch=len_train // batch_size,
  validation_steps=len_val // batch_size,
  callbacks=[earlyStopping, reduce_lr]
)

Then I am saving the weights:

model.save_weights('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/model_weights.h5')
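
To see which layer names end up in that file, a quick side check helps (a minimal sketch assuming h5py, which Keras itself uses to write HDF5 files):

import h5py

# Keras stores one HDF5 group per layer, keyed by layer name;
# by_name=True matching later relies on exactly these names.
with h5py.File('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/model_weights.h5', 'r') as f:
    print(list(f.keys())[:10])  # first few layer names in the file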

Adding more layers:

input_tensor = Input(shape=train_generator.image_shape)

base_model = applications.ResNet152(weights='imagenet', include_top=False, input_tensor=input_tensor)

# Freeze the backbone for the second phase
for layer in base_model.layers:
    layer.trainable = False

x = Flatten()(base_model.output)
 
x = Dense(1024, kernel_regularizer=tf.keras.regularizers.L2(l2=0.01), 
          kernel_initializer=tf.keras.initializers.HeNormal(), 
          kernel_constraint=tf.keras.constraints.UnitNorm(axis=0))(x)
x = LeakyReLU()(x)
x = BatchNormalization()(x)
x = Dropout(rate=0.1)(x)
x = Dense(512, kernel_regularizer=tf.keras.regularizers.L2(l2=0.01), 
          kernel_initializer=tf.keras.initializers.HeNormal(), 
          kernel_constraint=tf.keras.constraints.UnitNorm(axis=0))(x)
x = LeakyReLU()(x)
x = BatchNormalization()(x)
  
predictions = Dense(num_classes, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)

Loading the weights after adding the layers, using by_name=True, both according to the Keras tutorial:

model.load_weights('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/model_weights.h5', by_name=True)
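
To check whether the file is actually applied, I can snapshot a layer before the load_weights call above and compare afterwards (a minimal diagnostic sketch: at that point the base still holds the ImageNet initialization, so a successful load has to change something):

import numpy as np

# Snapshot the first layer that has weights, then load and compare
layer = next(l for l in model.layers if l.get_weights())
before = [w.copy() for w in layer.get_weights()]

model.load_weights('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/model_weights.h5', by_name=True)

after = layer.get_weights()
print('weights restored from file:',
      any(not np.allclose(b, a) for b, a in zip(before, after)))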

Then I start training again:

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy']
    )

model.fit(
  train_generator,
  validation_data=valid_generator,
  epochs=epochs,
  steps_per_epoch=len_train // batch_size,
  validation_steps=len_val // batch_size,
  callbacks=[earlyStopping, reduce_lr]
)

But it starts at a very low accuracy, basically from zero again, so I'm guessing something is wrong here. Any ideas on how to fix this?


Solution

  • When you use Adam and save only the model weights, you have to save and load the optimizer weights as well; otherwise the restored model trains with a freshly initialized optimizer. Saving the optimizer state right after the first training run:

      import pickle

      # Grab the optimizer's internal state (Adam moments, iteration count)
      weight_values = model.optimizer.get_weights()
      with open(output_path+'optimizer.pkl', 'wb') as f:
          pickle.dump(weight_values, f)
    
    When restoring, run one dummy training step first so that Keras creates the optimizer's weight slots, then overwrite them with the saved values:

      import tensorflow as tf

      # One dummy step forces the optimizer to build its weight slots
      dummy_input = tf.random.uniform(inp_shape)   # tensor of input shape
      dummy_label = tf.random.uniform(label_shape) # tensor of label shape
      hist = model.fit(dummy_input, dummy_label)

      with open(path_to_saved_model+'optimizer.pkl', 'rb') as f:
          weight_values = pickle.load(f)
      model.optimizer.set_weights(weight_values)
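
  • Alternatively, if the architecture is unchanged between saving and loading (not the case for the extended second model above), saving the full model preserves the optimizer state automatically. A minimal sketch, with the path as a placeholder; note also that get_weights()/set_weights() exist on the legacy Keras optimizers, so recent TensorFlow versions may need tf.keras.optimizers.legacy.Adam for the pickle approach:

      # model.save() bundles architecture, layer weights and optimizer state
      model.save('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/full_model.h5')

      # Later: everything, including the Adam state, comes back in one call
      model = tf.keras.models.load_model('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/full_model.h5')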