I am running TensorFlow Object Detection API training and evaluation on a custom dataset with 8 classes using model_main.py, and I have two questions about the results.
The total loss dropped below 1 after about 8,000 steps, but then began climbing slowly (relatively) from step 10k to step 80k, ending around 1.4. Why would this happen?
Regarding the evaluation results, why does only IoU=0.50 reach 0.966 precision while the rest are below 0.5, as shown below?
Accumulating evaluation results...
DONE (t=0.07s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.471
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.966
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.438
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.471
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.447
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.562
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.587
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.587
INFO:tensorflow:Finished evaluation at 2019-05-06-03:56:37
INFO:tensorflow:Saving dict for global step 80000: DetectionBoxes_Precision/mAP
Yes, these results are reasonable. Answering your questions:
[ IoU=0.50:0.95 | area= all | maxDets=100 ]
means the precision is averaged over IoU thresholds ranging from 0.50 to 0.95 in steps of 0.05 (at each threshold, a detection counts as a true positive only if its IoU with a ground-truth box meets that threshold), over objects of all areas (small, medium, and large), with at most 100 detections per image. A lower IoU threshold means more detections are counted as true positives, so IoU=0.50
has the highest precision score because it admits the largest number of positive detections, while at IoU=0.95
far fewer detections qualify as true positives. IoU=0.50:0.95
is the average of the precisions across these thresholds, so the precision for this category is necessarily lower than at IoU=0.50
alone. BTW, the -1.000 when area=small, medium
means those size categories are absent from your dataset, see here. So all objects in your dataset are large.
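To make the averaging concrete, here is a minimal sketch of how the COCO-style AP@[.50:.95] is computed from per-threshold AP values. The per-threshold numbers below are made up purely for illustration (they are not taken from your log); the point is that averaging over ever-stricter thresholds pulls the result well below AP@0.50:

```python
# The ten IoU thresholds used by the COCO metric: 0.50, 0.55, ..., 0.95.
thresholds = [0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95]

# Hypothetical per-threshold AP values for some model: precision
# typically falls as the IoU threshold tightens.
ap_at = [0.95, 0.90, 0.82, 0.72, 0.60, 0.47, 0.33, 0.20, 0.09, 0.02]

# AP@[.50:.95] is simply the mean over the ten thresholds.
mean_ap = sum(ap_at) / len(ap_at)
print(f"AP@0.50 = {ap_at[0]:.2f}, AP@[.50:.95] = {mean_ap:.3f}")
# -> AP@0.50 = 0.95, AP@[.50:.95] = 0.510
```

So a high AP@0.50 alongside a much lower AP@[.50:.95], as in your log, is expected behavior rather than a sign of a broken evaluation.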
Here is a good illustration of why a lower IoU threshold means more detections count as true positives (image source):
With a threshold of 0.4, all three detections are correct (true positives); at 0.6, only two qualify; and at 0.9, only one detection is correct.
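The three-threshold illustration can be sketched numerically. The boxes below are hypothetical coordinates I chose so that the three detections have IoU of exactly 1.0, about 0.68, and about 0.47 with the ground-truth box:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

ground_truth = (0, 0, 10, 10)
# Three hypothetical detections with IoU = 1.00, ~0.68, ~0.47 vs. the ground truth.
detections = [(0, 0, 10, 10), (1, 1, 11, 11), (2, 2, 12, 12)]

counts = {}
for thresh in (0.4, 0.6, 0.9):
    counts[thresh] = sum(iou(ground_truth, d) >= thresh for d in detections)
    print(f"threshold {thresh}: {counts[thresh]} true positive(s)")
# threshold 0.4 -> 3, threshold 0.6 -> 2, threshold 0.9 -> 1
```

The same detections yield three, two, or one true positive depending only on which threshold you pick, which is exactly why the per-threshold AP values in your log differ so much.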
Some further reading regarding how mAP is calculated.