python, tensorflow, tensorflow2.0, object-detection, object-detection-api

How to train and eval at the same time in Tensorflow Object Detection API v2


I was wondering how I can train and evaluate the model at each checkpoint in the TF2 Object Detection API. The documentation suggests training first and then evaluating the model:

train

python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --alsologtostderr

evaluate

python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --checkpoint_dir=${CHECKPOINT_DIR} \
    --alsologtostderr

What I want is to run training and have an evaluation done after each checkpoint is created (every 1000 steps). I know that in the TF1 Object Detection API evaluation is done automatically every 1000 steps, and that is what I want to replicate in TF2.


Solution

  • According to model_main_tf2.py, there is a flag for running evaluation at each epoch, but it is only available for distributed training: https://github.com/tensorflow/models/blob/master/research/object_detection/model_main_tf2.py#L37-L38

    However, I think they are working on making it available more widely, since they have added some flags related to evaluation data: https://github.com/tensorflow/models/blob/master/research/object_detection/model_main_tf2.py#L39-L44

    For further detail, look at their train function here, which describes the training workflow. You can see that evaluation has not been implemented there yet: https://github.com/tensorflow/models/blob/master/research/object_detection/model_lib_v2.py#L419

    Therefore, you have to work around it by writing a new training script that uses the library linked above together with one of the evaluation methods it already provides, and runs that evaluation every epoch. Or you can wait :D
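
One possible workaround that needs no changes to the library is to run the two commands from the question side by side. As far as I can tell, when --checkpoint_dir is set, model_main_tf2.py runs in evaluation mode and keeps polling that directory for new checkpoints, so an evaluation is produced for each checkpoint the training job writes. The sketch below is only an illustration under that assumption. The helper names build_commands and launch_train_and_eval are hypothetical, the flags are exactly the ones shown in the question, and hiding the GPU from the evaluator via CUDA_VISIBLE_DEVICES is my own assumption to keep the two jobs from competing for GPU memory.

```python
import os
import subprocess

# Path of the script as used in the question (relative to models/research).
SCRIPT = "object_detection/model_main_tf2.py"


def build_commands(pipeline_config_path, model_dir, script=SCRIPT):
    """Return (train_cmd, eval_cmd) as argument lists.

    Hypothetical helper: it only assembles the flags shown in the question.
    """
    common = [
        "python", script,
        f"--pipeline_config_path={pipeline_config_path}",
        f"--model_dir={model_dir}",
        "--alsologtostderr",
    ]
    train_cmd = list(common)
    # Passing --checkpoint_dir switches the script to evaluation mode.
    # Here it points at the directory the training job writes to.
    eval_cmd = common + [f"--checkpoint_dir={model_dir}"]
    return train_cmd, eval_cmd


def launch_train_and_eval(pipeline_config_path, model_dir, eval_on_cpu=True):
    """Start training and evaluation as two concurrent processes."""
    train_cmd, eval_cmd = build_commands(pipeline_config_path, model_dir)

    eval_env = dict(os.environ)
    if eval_on_cpu:
        # Assumption: hide GPUs from the evaluator so that training
        # keeps the GPU memory to itself.
        eval_env["CUDA_VISIBLE_DEVICES"] = "-1"

    train_proc = subprocess.Popen(train_cmd)
    eval_proc = subprocess.Popen(eval_cmd, env=eval_env)
    # The caller decides when to wait on or terminate each process.
    return train_proc, eval_proc
```

For example, launch_train_and_eval(os.environ["PIPELINE_CONFIG_PATH"], os.environ["MODEL_DIR"]) would start both jobs; you would then call wait() on the training process and stop the evaluator once the last checkpoint has been evaluated. Opening two terminals and running the train and evaluate commands by hand gives the same result.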