I was wondering whether training YOLO on an unbalanced dataset would hurt its accuracy. Would the classes with fewer images have lower accuracy?
I have 3 classes and 14.4k images.
One class has 12,000 image examples; the other two have 1,000 image examples each.
Would this be an issue?
I am training YOLOR right now and my mAP is at 0.36 on my custom dataset.
I ran inference with the trained weights and the classification is good, but I need to set the confidence threshold very low, because the classes with fewer images get very low confidence (0.05 - 0.12) while the class with 12,000 images gets confidence of 0.45 - 0.90.
Dataset imbalance generally hurts performance. There are, though, a few tricks that may be helpful in your situation:
compute_class_weight method.
Your low-confidence problem may be one of the consequences of underfitting. That's from personal experience with two-stage detectors (mostly Faster R-CNN).
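To make the class-weight idea concrete, here is a minimal sketch of the "balanced" heuristic, n_samples / (n_classes * class_count), which is the same formula scikit-learn's compute_class_weight uses when called with class_weight="balanced". It is written in plain Python so it runs without extra dependencies. The class names and the helper balanced_class_weights are made up for illustration, and the counts assume one label per image as described in the question.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * class_count)}.

    Hypothetical helper; mirrors the "balanced" heuristic of
    sklearn.utils.class_weight.compute_class_weight.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Assumed label distribution from the question: 12,000 / 1,000 / 1,000.
# The class names are placeholders.
labels = ["big"] * 12000 + ["small_a"] * 1000 + ["small_b"] * 1000

weights = balanced_class_weights(labels)
for cls, w in sorted(weights.items()):
    print(f"{cls}: {w:.3f}")
# big: 0.389
# small_a: 4.667
# small_b: 4.667
```

With these counts the rare classes end up weighted 12 times more heavily than the dominant one. How the weights are then applied (in the classification loss, or as sampling probabilities for images) depends on the training code you use, so that part is left out here.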