I'm trying to do transfer learning on MobileNetV3-Small using Tensorflow 2.5.0 to predict dog breeds (133 classes) and since it got reasonable accuracy on the ImageNet dataset (1000 classes) I thought it should have no problem adapting to my problem.
I've tried a multitude of training variations and recently had a breakthrough but now my training stagnates at about 60% validation accuracy with minor fluctuations in validation loss (accuracy and loss curves for training and validation below).
I tried using ReduceLROnPlateau
in the 3rd graph below, but it didn't help to improve matters. Can anyone suggest how I could improve the training?
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.layers import GlobalMaxPooling2D, Dense, Dropout, BatchNormalization
from tensorflow.keras.applications import MobileNetV3Large, MobileNetV3Small
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True # needed for working with this dataset
# define generators
train_datagen = ImageDataGenerator(vertical_flip=True, horizontal_flip=True,
rescale=1.0/255, brightness_range=[0.5, 1.5],
zoom_range=[0.5, 1.5], rotation_range=90)
test_datagen = ImageDataGenerator(rescale=1.0/255)
train_gen = train_datagen.flow_from_directory(train_dir, target_size=(224,224),
batch_size=32, class_mode="categorical")
val_gen = test_datagen.flow_from_directory(val_dir, target_size=(224,224),
batch_size=32, class_mode="categorical")
test_gen = test_datagen.flow_from_directory(test_dir, target_size=(224,224),
batch_size=32, class_mode="categorical")
pretrained_model = MobileNetV3Small(input_shape=(224,224,3), classes=133,
weights="imagenet", pooling=None, include_top=False)
# set all layers trainable because when I froze most of the layers the model didn't learn so well
for layer in pretrained_model.layers:
layer.trainable = True
last_output = pretrained_model.layers[-1].output
x = GlobalMaxPooling2D()(last_output)
x = BatchNormalization()(x)
x = Dense(512, activation='relu')(x)
x = Dense(133, activation='softmax')(x)
model = Model(pretrained_model.input, x)
model.compile(optimizer=Adam(learning_rate=1e-5), loss='categorical_crossentropy', metrics=['accuracy'])
# val_acc with min_delta 0.003; val_loss with min_delta 0.01
plateau = ReduceLROnPlateau(monitor="val_loss", mode="min", patience=5,
min_lr=1e-8, factor=0.3, min_delta=0.01,
verbose=1)
checkpointer = ModelCheckpoint(filepath=savepath, verbose=1, save_best_only=True,
monitor="val_accuracy", mode="max",
save_weights_only=True)
Your code looks good, but it seems to have one issue - you might be rescaling the inputs twice. According to the docs for MobilenetV3:
The preprocessing logic has been included in the mobilenet_v3 model implementation. Users are no longer required (...) to normalize the input data.
Now, in your code, there is:
test_datagen = ImageDataGenerator(rescale=1.0/255)
which essentially, makes the first model layers to rescale, already rescaled values.
The same applies for train_datagen
.
You could try removing the rescale
argument from both train
and test
loaders, or setting rescale=None
.
This could also explain why the model did not learn well with the backbone frozen.