I'd like to use the TensorFlow Object Detection API to identify objects in a series of webcam images. The Faster R-CNN models pre-trained on the COCO dataset appear to be suitable, as they cover all the object categories I need.
However, I'd like to improve the performance of the model at identifying fairly small objects within each image. If I understand correctly, I need to edit the anchor scales parameter in the config file to get the model to use smaller bounding boxes.
My questions are:
I'm currently feeding 1280x720 images to the model. Objects of around 200x150 pixels are the ones I'm finding harder to detect.
Unfortunately, you'll need to retrain completely, since the weights depend on the shapes of the anchors.
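To see what changing the scales actually does to the anchor shapes, here is a rough sketch of the arithmetic. It assumes a grid anchor generator with a 256x256 base anchor and the scales and aspect ratios commonly found in the sample Faster R-CNN configs; those values and the helper name `anchor_sizes` are my assumptions, so check them against your own config.

```python
import math


def anchor_sizes(scales, aspect_ratios, base_size=256):
    """Return (width, height) in pixels for every scale / aspect-ratio pair.

    Assumes width = scale * sqrt(ratio) * base and
    height = scale / sqrt(ratio) * base (aspect ratio = width / height).
    """
    sizes = []
    for scale in scales:
        for ratio in aspect_ratios:
            root = math.sqrt(ratio)
            width = round(scale * root * base_size)
            height = round(scale / root * base_size)
            sizes.append((width, height))
    return sizes


# Assumed defaults from the sample configs.
default_anchors = anchor_sizes([0.25, 0.5, 1.0, 2.0], [0.5, 1.0, 2.0])

# Hypothetical alternative: add one smaller scale, drop the largest.
smaller_anchors = anchor_sizes([0.125, 0.25, 0.5, 1.0], [0.5, 1.0, 2.0])

if __name__ == "__main__":
    print("default:", default_anchors)
    print("smaller:", smaller_anchors)
```

Under these assumptions the smallest square anchor goes from 64x64 to 32x32 pixels. Note that these sizes are measured on the image after the image resizer has run, not on the original frame.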
A higher-resolution feature map should help, although it will slow the process down. You can get one either by switching to a feature extractor that downsamples the input less (max-pooling layers with stride > 1 are usually what reduce the spatial size) or by upscaling the image a bit in the initial image resizer.
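As a back-of-the-envelope check, the sketch below estimates how big the image and the feature map end up for a 1280x720 frame. It assumes a keep-aspect-ratio resizer with min_dimension 600 and max_dimension 1024 and a feature extractor with an overall stride of 16; those numbers, and both function names, are assumptions for illustration, and the real resizer may round slightly differently.

```python
import math


def keep_aspect_ratio_resize(width, height, min_dim=600, max_dim=1024):
    """Approximate a keep-aspect-ratio resizer.

    Scale the shorter side to min_dim, unless that would push the longer
    side past max_dim, in which case scale the longer side to max_dim.
    """
    scale = min_dim / min(width, height)
    if max(width, height) * scale > max_dim:
        scale = max_dim / max(width, height)
    return round(width * scale), round(height * scale), scale


def feature_map_size(width, height, stride=16):
    """Approximate feature map size for a given overall stride."""
    return math.ceil(width / stride), math.ceil(height / stride)


if __name__ == "__main__":
    # Assumed default resizer settings.
    w, h, s = keep_aspect_ratio_resize(1280, 720)
    print("resized:", (w, h), "scale:", s)
    print("feature map:", feature_map_size(w, h))
    print("200x150 object becomes:", (round(200 * s), round(150 * s)))

    # Hypothetical larger resizer that keeps the frame at full resolution.
    w2, h2, s2 = keep_aspect_ratio_resize(1280, 720, min_dim=720, max_dim=1280)
    print("resized:", (w2, h2), "feature map:", feature_map_size(w2, h2))
```

With the assumed defaults the frame is shrunk to 1024x576, so a 200x150 object is only about 160x120 pixels by the time the network sees it. Raising the resizer limits to match the input keeps the full 1280x720 and gives an 80x45 feature map instead of 64x36, at the cost of more computation.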