My goal is to first train only using ResNet152 and then save the learned weights. Then i want to use these weights as a base for a more complex model with added layers which i ultimately want to do hyperparameter tuning on. The reason for this approach is that doing it all at once takes a very long time. The problem i am having is that my code doesnt seem to work. I dont get an error message but when i start training the more complex model it seems to start from 0 again and not using the learned ResNet152 weights.
Here is the code:
First i am only using ResNet152 and the output layer
input_tensor = Input(shape=train_generator.image_shape)
base_model = applications.ResNet152(weights='imagenet', include_top=False, input_tensor=input_tensor)
for layer in base_model.layers[:]:
layer.trainable = True¨
x = Flatten()(base_model.output)
predictions = Dense(num_classes, activation= 'softmax')(x)
model = Model(inputs = base_model.input, outputs = predictions)
model.compile(
loss='sparse_categorical_crossentropy',
optimizer=opt,
metrics=['accuracy'])
model.fit(
train_generator,
validation_data=valid_generator,
epochs=epochs,
steps_per_epoch=len_train // batch_size,
validation_steps=len_val // batch_size,
callbacks=[earlyStopping, reduce_lr]
)
Then i am saving the weights:
model.save_weights('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/model_weights.h5')
Adding more layers.
input_tensor = Input(shape=train_generator.image_shape)
base_model = applications.ResNet152(weights='imagenet', include_top=False, input_tensor=input_tensor)
for layer in base_model.layers[:]:
layer.trainable = False
x = Flatten()(base_model.output)
x = Dense(1024, kernel_regularizer=tf.keras.regularizers.L2(l2=0.01),
kernel_initializer=tf.keras.initializers.HeNormal(),
kernel_constraint=tf.keras.constraints.UnitNorm(axis=0))(x)
x = LeakyReLU()(x)
x = BatchNormalization()(x)
x = Dropout(rate=0.1)(x)
x = Dense(512, kernel_regularizer=tf.keras.regularizers.L2(l2=0.01),
kernel_initializer=tf.keras.initializers.HeNormal(),
kernel_constraint=tf.keras.constraints.UnitNorm(axis=0))(x)
x = LeakyReLU()(x)
x = BatchNormalization()(x)
predictions = Dense(num_classes, activation= 'softmax')(x)
model = Model(inputs = base_model.input, outputs = predictions)
Loading the weights after the added layers and using by_name=True, both according to the keras tutorial.
model.load_weights('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/model_weights.h5', by_name=True)
Then i start training again.
model.compile(
loss='sparse_categorical_crossentropy',
optimizer=opt,
metrics=['accuracy']
)
model.fit(
train_generator,
validation_data=valid_generator,
epochs=epochs,
steps_per_epoch=len_train // batch_size,
validation_steps=len_val // batch_size,
callbacks=[earlyStopping, reduce_lr]
)
But it is starting at a very low accuracy, basically from 0 again, so im guessing something is wrong here. Any ideas on how to fix this?
When you use adam and save model weights only - you have to save/load optimizer weights as well:
weight_values = model.optimizer.get_weights()
with open(output_path+'optimizer.pkl', 'wb') as f:
pickle.dump(weight_values, f)
dummy_input = tf.random.uniform(inp_shape) # create a tensor of input shape
dummy_label = tf.random.uniform(label_shape) # create a tensor of label shape
hist = model.fit(dummy_input, dummy_label)
with open(path_to_saved_model+'optimizer.pkl', 'rb') as f:
weight_values = pickle.load(f)
optimizer.set_weights(weight_values)