I have fine-tuned a faster_rcnn_resnet101 model from the TensorFlow Model Zoo to detect my custom objects. I split the data into train and eval sets and used them in the config file during training. Now that training has completed, I want to test my model on unseen data (which I call the test data). I tried a couple of functions but cannot figure out for certain which code from TensorFlow's API I should use to evaluate performance on the test dataset. Below is what I tried:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.459
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.601
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.543
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.459
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.543
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.627
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.628
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.628
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Now, I know that mAP and AR can't be negative, so something is wrong. Why do I see negative values when I run the offline evaluation on the test dataset?
The commands that I used to run this pipeline are:

SPLIT=test
echo "
label_map_path: '/training_demo/annotations/label_map.pbtxt'
tf_record_input_reader: { input_path: '/training_demo/Predictions/test.record' }
" > /training_demo/${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt
echo "
metrics_set: 'coco_detection_metrics'
" > /training_demo/${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt
python object_detection/metrics/offline_eval_map_corloc.py \
--eval_dir='/training_demo/test_eval_metrics' \
--eval_config_path='/training_demo/test_eval_metrics/test_eval_config.pbtxt' \
--input_config_path='/training_demo/test_eval_metrics/test_input_config.pbtxt'
This produced:

DetectionBoxes_Recall/AR@100 (medium): -1.0
DetectionBoxes_Recall/AR@100 (small): -1.0
DetectionBoxes_Precision/[email protected]: -1.0
DetectionBoxes_Precision/mAP (medium): -1.0
etc.
I also used the pipeline:

python eval.py \
    --logtostderr \
    --checkpoint_dir=trained-inference-graphs/output_inference_graph/ \
    --eval_dir=test_eval_metrics \
    --pipeline_config_path=training/faster_rcnn_resnet101_coco-Copy1.config
The eval_input_reader in faster_rcnn_resnet101_coco-Copy1.config points to the test TFRecord, which contains the ground truth and detection information.
I would appreciate any help on this.
The evaluation metrics are in COCO format, so you can refer to the COCO API for the meaning of these values.

As specified in the COCO API code, -1 is the default value reported when a category is absent. In your case, all of the detected objects belong only to the 'small' area category, so the 'medium' and 'large' metrics have no ground truth to evaluate against and default to -1. The 'small', 'medium', and 'large' categories are defined by the area in pixels that an object occupies, as specified here.
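For reference, the COCO area ranges can be sketched as below. The thresholds of 32² and 96² pixels come from the COCO API's standard area ranges; the helper function name is my own, not part of any library:

```python
def coco_area_category(box_area_px):
    """Classify a ground-truth box by COCO's standard area ranges (pixels^2)."""
    if box_area_px < 32 ** 2:
        return "small"   # area < 1024 px^2
    elif box_area_px < 96 ** 2:
        return "medium"  # 1024 px^2 <= area < 9216 px^2
    return "large"       # area >= 9216 px^2

# A 30x30 box counts as small; if every ground-truth box in your test set
# falls in this range, only the 'small' metrics get real values.
print(coco_area_category(30 * 30))    # small
print(coco_area_category(60 * 60))    # medium
print(coco_area_category(120 * 120))  # large
```

If you compute the pixel areas of the ground-truth boxes in your test TFRecord and they all land below 32² = 1024 px², that would explain the -1 values for the medium and large rows.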