Tags: tensorflow, tensorflow2.0, object-detection, object-detection-api

Tensorflow Object Detection API - "fine_tune" vs "detection" vs "classification"


I am following this tutorial: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html

The tutorial includes the following snippet in the file pipeline.config:

fine_tune_checkpoint_type: "detection" # Set this to "detection" since we want to be training the full detection model

Further investigation leads to the following discoveries:

  • There are at least 3 options for the field fine_tune_checkpoint_type: fine_tune, detection and classification.
  • Not all models from the model zoo allow all options.

My questions are:

  • What does each of fine_tune, detection and classification mean in this context and, more importantly, when is it appropriate to use each one?
  • How do I tell which options are compatible with models in the model zoo?

Ultimately I wish to do transfer learning - e.g. take an existing trained model and train it to draw boxes for one or more novel classes.


Solution

  • These options indicate how the checkpoint is restored; they are described in the Object Detection API source.

    The relevant part is copied here:

    This option controls how variables are restored from the (pre-trained) fine_tune_checkpoint. For TF2 models, 3 different types are supported:

    1. "classification": Restores only the classification backbone part of the feature extractor. This option is typically used when you want to train a detection model starting from a pre-trained image classification model, e.g. a ResNet model pre-trained on ImageNet.
    2. "detection": Restores the entire feature extractor. The only parts of the full detection model that are not restored are the box and class prediction heads. This option is typically used when you want to use a pre-trained detection model and train on a new dataset or task which requires different box and class prediction heads.
    3. "full": Restores the entire detection model, including the feature extractor, its classification backbone, and the prediction heads. This option should only be used when the pre-training and fine-tuning tasks are the same. Otherwise, the model's parameters may have incompatible shapes, which will cause errors when attempting to restore the checkpoint.

    For more details about this parameter, see the restore_map (TF1) or restore_from_object (TF2) function documentation in the /meta_architectures/*meta_arch.py files.
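
The three descriptions above can be condensed into a small decision helper. This is only a sketch of the quoted guidance: the function name, its parameters, and the part labels are made up for illustration and are not identifiers from the Object Detection API. Whether a particular model accepts a given value still depends on its meta architecture.

```python
# Summary of the quoted description. The part names are illustrative labels,
# not identifiers from the Object Detection API.
RESTORED_PARTS = {
    "classification": ("classification backbone",),
    "detection": ("classification backbone", "rest of the feature extractor"),
    "full": (
        "classification backbone",
        "rest of the feature extractor",
        "box and class prediction heads",
    ),
}


def suggest_checkpoint_type(checkpoint_is_detection_model, same_task):
    """Pick a fine_tune_checkpoint_type following the quoted guidance.

    checkpoint_is_detection_model: False if the checkpoint comes from an
        image classification model (e.g. a ResNet pre-trained on ImageNet).
    same_task: True only if pre-training and fine-tuning use the same task,
        so the prediction heads have compatible shapes.
    """
    if not checkpoint_is_detection_model:
        return "classification"
    return "full" if same_task else "detection"


# Transfer learning to novel classes: detection checkpoint, different task.
choice = suggest_checkpoint_type(checkpoint_is_detection_model=True, same_task=False)
print(choice, "->", ", ".join(RESTORED_PARTS[choice]))
```

For the use case in the question (a pre-trained detector retrained on new classes), the helper returns "detection", which leaves the box and class prediction heads to be trained from scratch.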

    My guess is that fine_tune has since been replaced by "full". Based on your needs, the right choice appears to be "detection". To find out which models support which options, look, as indicated above, at the restore_from_object function definition in the relevant /meta_architectures/*meta_arch.py file.
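
Reading each meta architecture file by hand can be tedious, so one rough way to list the values a file checks for is to search its source for comparisons against fine_tune_checkpoint_type. The sketch below assumes the file tests the value with plain `==` comparisons against string literals, which may not hold for every model. The helper name and the sample source are hypothetical and written for this example, not copied from the API.

```python
import re
from typing import List

# Matches comparisons such as: fine_tune_checkpoint_type == 'detection'
_TYPE_COMPARISON = re.compile(r"""fine_tune_checkpoint_type\s*==\s*['"](\w+)['"]""")


def checkpoint_types_in_source(source: str) -> List[str]:
    """Return the distinct string literals compared against fine_tune_checkpoint_type."""
    return sorted(set(_TYPE_COMPARISON.findall(source)))


# Hypothetical excerpt standing in for a *meta_arch.py file (an assumption,
# not the real implementation).
SAMPLE_SOURCE = '''
def restore_from_object(self, fine_tune_checkpoint_type="detection"):
    if fine_tune_checkpoint_type == "classification":
        return {"feature_extractor": "backbone only"}
    elif fine_tune_checkpoint_type == "detection":
        return {"model": "feature extractor"}
    elif fine_tune_checkpoint_type == 'full':
        return {"model": "everything"}
    raise ValueError("Not supported fine_tune_checkpoint_type")
'''

print(checkpoint_types_in_source(SAMPLE_SOURCE))
```

To use it on a real checkout, read the text of the meta architecture file that matches your model and pass it to checkpoint_types_in_source. Values that are accepted through other mechanisms, such as a list of supported types built elsewhere in the file, will not be found by this search, so treat the result as a starting point for reading the function itself.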