tensorflow machine-learning computer-vision tensorflow2.0 object-detection

TensorFlow Object Detection API: Train from exported model checkpoint

I have a previously exported RetinaNet model (originally from the object detection zoo) that was fine-tuned on a custom dataset with the TensorFlow Object Detection API (TensorFlow version 2.4.1). The exported model's folder contains pipeline.config and a checkpoint directory, as the paths in the evaluation command below show.

Running the evaluation on the model (as shown below) gives an mAP of 0.5.

python model_main_tf2.py --model_dir=exported-models/retinanet --pipeline_config_path=exported-models/retinanet/pipeline.config --checkpoint_dir=exported-models/retinanet/checkpoint

The question

Due to unfortunate circumstances, I no longer have the training folder from when the model was trained. As I recently got more data, I would like to use the exported model as the starting point for further training, so I set fine_tune_checkpoint to the exported checkpoint in the pipeline.config for the new training:

  fine_tune_checkpoint: "exported-models/retinanet/checkpoint/ckpt-0"
  num_steps: 25000
  startup_delay_steps: 0.0
  replicas_to_aggregate: 8
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
  fine_tune_checkpoint_type: "detection"
  use_bfloat16: false
  fine_tune_checkpoint_version: V2

However, when starting the training with the model_main_tf2.py script, the first checkpoint (which is at step 0) has a terrible score - even on the same dataset that the evaluation was run on for the exported model.

I would expect the first checkpoint to have the same score as the exported model, at least on the same test set. Is this assumption wrong, and if so, why?

Solution

I finally found the following comment in the Object Detection API's train.proto:

// Whether to load all checkpoint vars that match model variable names and
// sizes. This option is only available if `from_detection_checkpoint` is
// True.  This option is *not* supported for TF2 --- setting it to true
// will raise an error. **Instead, set fine_tune_checkpoint_type: 'full'.**
  optional bool load_all_detection_checkpoint_vars = 19 [default = false];

By setting fine_tune_checkpoint_type to "full", I got the correct mAP for the first checkpoint (at 0 steps).
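For anyone who wants to script the change, here is a minimal sketch that flips fine_tune_checkpoint_type in the text of a pipeline.config. It only does a plain-text substitution and does not use the Object Detection API itself; the helper name set_fine_tune_checkpoint_type and the shortened example string are made up for illustration. My understanding (not verified against every model architecture) is that "detection" restores only part of the model in TF2, which would explain the poor score at step 0, while "full" restores everything.

```python
import re


def set_fine_tune_checkpoint_type(config_text: str, new_type: str = "full") -> str:
    """Return config_text with fine_tune_checkpoint_type set to new_type.

    Hypothetical helper: a plain-text edit, not part of the Object Detection API.
    """
    # Match only the *_type field, not fine_tune_checkpoint or *_version.
    pattern = re.compile(r'(fine_tune_checkpoint_type\s*:\s*)"[^"]*"')
    updated, count = pattern.subn(
        lambda m: f'{m.group(1)}"{new_type}"', config_text
    )
    if count == 0:
        raise ValueError("fine_tune_checkpoint_type not found in config text")
    return updated


# Shortened fragment based on the train settings from the question.
example = """  fine_tune_checkpoint: "exported-models/retinanet/checkpoint/ckpt-0"
  num_steps: 25000
  fine_tune_checkpoint_type: "detection"
  fine_tune_checkpoint_version: V2
"""

print(set_fine_tune_checkpoint_type(example))
```

To apply it to a real file, read pipeline.config into a string, pass it through the helper, and write the result back before starting model_main_tf2.py.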