I am training a Mask R-CNN model with 2 classes: vehicles and roads. I have a question about preparing the dataset. Which of the following options is better for getting higher accuracy?
>>> 1 - Having the same number of instances in the whole dataset, like:
Car images: 50
Total cars: 500 (each car image has 10 cars)
Road images: 500
Total roads: 500 (each road image has 1 road)
>>> Here the instance counts of roads and cars are equal.
>>> 2 - Having the same number of images in the whole dataset, like:
Car images: 500
Total cars: 10000 (each car image has 20 cars)
Road images: 500
Total roads: 700 (each road image has 1-2 roads)
>>> Here the image counts of roads and cars are equal.
Which option is better to get higher accuracy? Thank you for your time.
The classification and mask networks work only on region proposals, and the number of useful proposals is tied to the number of objects, not the number of images. So you should focus mainly on the number of car and road instances. That said, you should also use as large a dataset as possible. If you have enough data and a suitably sized network, an unbalanced dataset should not be a problem unless one class is rare.
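To make the instance-versus-image distinction concrete, here is a minimal sketch that counts instances per class for the two options in the question. The `instance_counts` helper and the list-of-labels annotation layout are hypothetical, not part of any Mask R-CNN library, and the 200/300 split of road images in option 2 is just one assumption that yields the stated total of 700 roads.

```python
from collections import Counter


def instance_counts(annotations):
    """Count object instances per class.

    `annotations` is a list with one entry per image; each entry is the
    list of class labels of the objects in that image (hypothetical layout).
    """
    counts = Counter()
    for labels in annotations:
        counts.update(labels)
    return counts


# Option 1: 50 car images with 10 cars each, 500 road images with 1 road each.
option_1 = [["car"] * 10 for _ in range(50)] + [["road"] for _ in range(500)]

# Option 2: 500 car images with 20 cars each, 500 road images with 1-2 roads.
# Assumed split: 200 images with 2 roads and 300 with 1 road (700 in total).
option_2 = (
    [["car"] * 20 for _ in range(500)]
    + [["road"] * 2 for _ in range(200)]
    + [["road"] for _ in range(300)]
)

counts_1 = instance_counts(option_1)
counts_2 = instance_counts(option_2)

# Cars per road instance in option 2.
ratio_2 = counts_2["car"] / counts_2["road"]

print(counts_1)
print(counts_2)
print(f"option 2 car/road instance ratio: {ratio_2:.1f}")
```

Option 1 is balanced at the instance level but small, while option 2 has far more instances of both classes at the cost of roughly a 14:1 car-to-road ratio.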
First try training on your whole dataset, and if road recognition turns out to be a problem, take a look at this discussion on how to deal with an unbalanced dataset: https://datascience.stackexchange.com/questions/38796/unbalanced-training-data-for-different-classes/38815#38815
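One common way to compensate for imbalance, not necessarily the one recommended in the linked discussion, is to weight each class inversely to its instance count. The sketch below only computes such weights from the option 2 counts in the question. The `inverse_frequency_weights` helper is hypothetical, and whether and how you can plug the weights into the classification loss depends on the Mask R-CNN implementation you use.

```python
def inverse_frequency_weights(counts):
    """Return per-class weights proportional to 1 / instance count.

    Uses weight = total / (n_classes * count), so a perfectly balanced
    dataset gets a weight of 1.0 for every class.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {cls: total / (n_classes * n) for cls, n in counts.items()}


# Instance counts from option 2 in the question.
counts = {"car": 10000, "road": 700}
weights = inverse_frequency_weights(counts)

for cls, w in weights.items():
    print(f"{cls}: {w:.3f}")
```

With these counts the road class ends up weighted about 14 times more heavily than the car class, which mirrors the instance ratio.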