Note: I am not sure this is the right website to ask this kind of question. Please tell me where I should ask it before downvoting this "because this isn't the right place to ask". Thanks!
I am currently experimenting with deep learning using Keras. I have already tried a model similar to the one found in the Keras examples. This yields the expected results:
After this I wanted to try transfer learning. I did this by using the VGG16 network without retraining its weights (see code below). This gave very poor results: 63% accuracy after 10 epochs with a very shallow curve (see picture below), which seems to indicate that it will reach acceptable results only (if ever) after a very long training time (I would expect 200-300 epochs before it reaches 80%).
Is this normal behavior for this kind of application? Here are a few things I imagine could be causing these bad results:
- The images are only 32x32 pixels, which might be too few for the VGG16 net.
- Maybe I could get better results by setting the VGG16 layers to trainable, or by starting with random weights (only copying the model architecture and not the weights).
Thanks in advance!
My code:
Note that the inputs are 2 datasets (50000 training images and 10000 testing images) of labeled images with shape 32x32x3. Each pixel value is a float in the range [0.0, 1.0].
import keras

# load and preprocess data...

# get VGG16 base model and define new input shape
vgg16 = keras.applications.vgg16.VGG16(input_shape=(32, 32, 3),
                                       weights='imagenet',
                                       include_top=False)

# add new dense layers at the top
x = keras.layers.Flatten()(vgg16.output)
x = keras.layers.Dense(1024, activation='relu')(x)
x = keras.layers.Dropout(0.5)(x)
x = keras.layers.Dense(128, activation='relu')(x)
predictions = keras.layers.Dense(10, activation='softmax')(x)

# define and compile model
model = keras.Model(inputs=vgg16.inputs, outputs=predictions)
for layer in vgg16.layers:
    layer.trainable = False
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# training and validation
model.fit(x_train, y_train,
          batch_size=256,
          epochs=10,
          validation_data=(x_test, y_test))
model.evaluate(x_test, y_test)
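For reference, the elided loading/preprocessing step could look roughly like this. This is only a sketch under an assumption: that the data is CIFAR-10 loaded via keras.datasets, which matches the 50000/10000 split and the 32x32x3 shape described above.

import keras

# hypothetical preprocessing sketch -- assumes the data is CIFAR-10
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train = x_train.astype('float32') / 255.0   # scale pixel values to [0.0, 1.0]
x_test = x_test.astype('float32') / 255.0
y_train = keras.utils.to_categorical(y_train, 10)   # one-hot labels for categorical_crossentropy
y_test = keras.utils.to_categorical(y_test, 10)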
It's not that the VGG16 model doesn't work on that input size; it's that the weights you're using were pre-trained on a different input size (ImageNet). You need your source and target datasets to have the same input size so the pre-trained weights can transfer. So you could either do the pre-training on ImageNet images rescaled to 32x32x3, or keep the input size roughly the same as the one the pre-training was done on (often 224x224x3 or similar for ImageNet) and scale your target images to match. I have seen a paper recently where they transferred from ImageNet to CIFAR-10 and CIFAR-100 by upscaling the latter, which worked reasonably well, but that wouldn't be ideal.
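A minimal sketch of the upscaling approach, relative to the code in the question: upsample the 32x32 inputs before the VGG16 base so the input size roughly matches what the ImageNet weights were trained on. The 7x nearest-neighbor UpSampling2D (32x32 -> 224x224) is just one possible choice, not necessarily what that paper did.

import keras

# upscale 32x32 inputs to 224x224 before the pre-trained VGG16 base
inputs = keras.layers.Input(shape=(32, 32, 3))
upscaled = keras.layers.UpSampling2D(size=(7, 7))(inputs)   # 32x32 -> 224x224

vgg16 = keras.applications.vgg16.VGG16(input_tensor=upscaled,
                                       weights='imagenet',
                                       include_top=False)
for layer in vgg16.layers:
    layer.trainable = False   # same frozen setup as in the question

x = keras.layers.Flatten()(vgg16.output)
x = keras.layers.Dense(1024, activation='relu')(x)
x = keras.layers.Dropout(0.5)(x)
predictions = keras.layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=predictions)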
Second, with a target dataset that has that many training examples, freezing all the transferred layers is unlikely to be a good solution; in fact, setting all the layers to trainable will probably work best.
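A minimal sketch of that change, relative to the code in the question (the fine-tuning learning rate of 1e-4 is an assumption; a smaller-than-default rate is commonly used so the pre-trained weights aren't destroyed, and older Keras versions spell the argument lr instead of learning_rate):

# make the transferred VGG16 layers trainable as well (fine-tuning)
for layer in vgg16.layers:
    layer.trainable = True

# re-compile so the trainability change takes effect
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=256,
          epochs=10,
          validation_data=(x_test, y_test))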