I have the following output, but I can't figure out how to evaluate it because there is no F1 score or confusion matrix.
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.250
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.410
20/499 0.001595 0.6697 0 1.393: 100%|██████████| 12/12 [00:
21/499 0.001594 0.6417 0 1.353: 100%|██████████| 12/12 [00:
22/499 0.001594 0.6727 0 1.431: 100%|██████████| 12/12 [00:
I trained for 400 epochs, and this is just a small part of the output. I can't see the mAP either.
I use this line to evaluate:
!python tools/eval.py --data Fabric-Defect-2/data.yaml --weights runs/train/exp/weights/best_ckpt.pt --device 0
Is there a way to obtain detailed evaluation metrics such as F1 score, confusion matrix, and mAP?
Try this:

!python tools/eval.py --data Fabric-Defect-2/data.yaml --weights runs/train/exp/weights/best_ckpt.pt --device 0 --do_pr_metric True --plot_confusion_matrix --plot_curve True
Adding those three arguments gave me a confusion matrix along with F1, P, PR, and R curve plots.
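The plots should end up in the evaluation save directory; as far as I can tell, runs/val/exp is the default save_dir/name pair, but the console output of eval.py prints the actual path. A quick way to list them from the notebook:

import glob

# List the saved evaluation plots. 'runs/val/exp*' assumes the default
# --save_dir runs/val/ and --name exp; check the console output of eval.py
# for the actual directory if yours differs.
for path in sorted(glob.glob("runs/val/exp*/*.png")):
    print(path)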
These are the explanations of those arguments that I found in the eval.py script:
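(Paraphrased from YOLOv6's tools/eval.py; the exact help strings and defaults may differ between versions, so double-check your local copy.)

# Excerpt (paraphrased) from the argparse section of tools/eval.py:
parser.add_argument('--do_pr_metric', default=False, type=boolean_string,
                    help='whether to calculate precision, recall and F1 (needed for the curves)')
parser.add_argument('--plot_curve', default=True, type=boolean_string,
                    help='whether to save the P/R/F1/PR curve plots when --do_pr_metric is on')
parser.add_argument('--plot_confusion_matrix', default=False, action='store_true',
                    help='whether to save the confusion matrix plot; used together with --do_pr_metric')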
And I suggest you also play around with the --task flag. It has three options ('val', 'test', or 'speed'). I haven't tried 'test' or 'speed', so I don't know what their output looks like. Experiment and see which one you really need.
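For example, to benchmark inference speed you could run the same command with the task switched (untested on my side, so treat it as a sketch):

!python tools/eval.py --data Fabric-Defect-2/data.yaml --weights runs/train/exp/weights/best_ckpt.pt --device 0 --task speed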