machine-learning · deep-learning · conv-neural-network · image-recognition · object-detection

When should you use pretrained weights when training deep learning models?


I am interested in training a range of image classification and object detection models, and I am wondering what the general rule is for when to use the pretrained weights of a network like VGG16.

For example, it seems obvious that fine-tuning pretrained VGG16 ImageNet weights is helpful if your target classes are a subset of ImageNet's, e.g. cats and dogs.

However, it seems less clear to me whether using these pretrained weights is a good idea when training an image classifier with 300 classes, only some of which are subsets of the classes in the pretrained model.

What is the intuition around this?


Solution

  • Lower layers learn features that are not necessarily specific to your application/dataset: corners, edges, simple shapes, etc. So it does not matter whether your data is strictly a subset of the categories the original network can predict.

    Depending on how much data you have available for training, and how similar it is to the data used to pretrain the network, you can decide to freeze the lower layers and retrain only the higher ones, or simply train a classifier on top of the pretrained network (see the sketch after this answer).

    Check here for a more detailed answer.
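
    As a concrete illustration of the second option, here is a minimal Keras sketch that freezes the pretrained VGG16 base and trains only a new classifier head. The 300-class output, the head's layer sizes, and the 224x224 input size are assumptions taken from the question, not fixed requirements.

    ```python
    import tensorflow as tf

    NUM_CLASSES = 300  # assumed from the question; set to your dataset

    # Load VGG16 with ImageNet weights, dropping its original 1000-class head.
    base = tf.keras.applications.VGG16(
        weights="imagenet",
        include_top=False,
        input_shape=(224, 224, 3),
    )

    # Freeze the pretrained convolutional base: its generic low-level
    # features (corners, edges, simple shapes) transfer even when your
    # classes are not a subset of ImageNet's.
    base.trainable = False

    # Train only a new classifier head on top of the frozen features.
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-3),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    ```

    With more training data, or data less similar to ImageNet, you would instead unfreeze some of the higher convolutional blocks and fine-tune them with a small learning rate, keeping only the lowest layers frozen.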