SSD for detecting small objects


I'm going to use the SSD model from Google's Object Detection API to detect small objects (like a volleyball in a picture).

I want to change the following parameters in the config file (aspect ratios, scales, ...):

anchor_generator {
  ssd_anchor_generator {
    num_layers: 6
    min_scale: 0.2
    max_scale: 0.95
    aspect_ratios: 1.0
    aspect_ratios: 2.0
    aspect_ratios: 0.5
    aspect_ratios: 3.0
    aspect_ratios: 0.3333
  }
}

I have three questions:

  • To modify these parameters (scale, aspect ratio, ...), do I need to retrain the model? Or can I still use pre-trained models after these modifications and fine-tune them on my data?

  • Since the objects I want to detect are all small compared to the image size, does increasing or decreasing the number of conv layers in MobileNet improve detection (speed or accuracy)? If yes, in which file can I apply these changes?

  • Is there any specific method for modifying the SSD MobileNet detector so that it works better on small objects? For example, I know a 4x4 feature map (grid) is too coarse for the size of my object. Is there any way to remove the coarse grids and keep only the fine ones (like 8x8)?

Thank you.


Solution

  • Q: To modify these parameters (scale, aspect ratio, ...), do I need to retrain the model? Or can I still use pre-trained models after these modifications and fine-tune them on my data?

    Yes, you need to retrain the model, because the network was trained to predict boxes relative to anchors with those specific scales and aspect ratios. However, you do not need a new network: you can start from the existing pre-trained one and retrain (fine-tune) it with the new settings.
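One way to see why the anchor settings are tied to the trained weights: in a typical SSD head, the number of output channels of the box and class predictors depends on how many anchors sit at each feature-map location. The sketch below is a rough illustration under stated assumptions, not the API's own code: it assumes one anchor per aspect ratio plus one extra square anchor per location, ignores any per-layer special cases, and the function name `predictor_channels` is made up for this example.

```python
def predictor_channels(num_aspect_ratios, num_classes, extra_square_anchor=True):
    """Estimate SSD head output channels for one feature map.

    Assumption: one anchor per aspect ratio, plus (optionally) one extra
    square anchor at an interpolated scale, as in the SSD paper.
    """
    anchors = num_aspect_ratios + (1 if extra_square_anchor else 0)
    return {
        "anchors_per_location": anchors,
        "box_channels": anchors * 4,                    # 4 box offsets per anchor
        "class_channels": anchors * (num_classes + 1),  # +1 for background
    }


# Config from the question: 5 aspect ratios; assume a single "volleyball" class.
original = predictor_channels(num_aspect_ratios=5, num_classes=1)
# Hypothetical modified config that keeps only 3 aspect ratios.
modified = predictor_channels(num_aspect_ratios=3, num_classes=1)

print(original)
print(modified)
```

If the channel counts differ between the two configurations, the predictor layers no longer match the pre-trained weights and have to be learned again, while the feature extractor can usually still be reused for fine-tuning.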

    Q: Since the objects I want to detect are all small compared to the image size, does increasing or decreasing the number of conv layers in MobileNet improve detection (speed or accuracy)? If yes, in which file can I apply these changes?

    Typically you want the smallest network that still produces reasonable accuracy. A bigger concern is image resampling: I am not sure whether TensorFlow lets you set limits on it, and at a small input size you might resample the volleyball out of the image altogether. As for speed, training is the slowest part of working with any neural network; running a detection query is usually much less of a performance concern.

    Q: Is there any specific method for modifying the SSD MobileNet detector so that it works better on small objects? For example, I know a 4x4 feature map (grid) is too coarse for the size of my object. Is there any way to remove the coarse grids and keep only the fine ones (like 8x8)?

    As mentioned above, controlling the resampling would be important, but I am not sure whether that is possible. The scale parameters (min_scale, max_scale) may be the place to look.
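To get a feel for what the scale parameters mean for small objects, here is a minimal sketch that spreads anchor scales linearly between min_scale and max_scale over num_layers feature maps, which is the scheme described in the SSD paper and, as far as I know, the one the SSD anchor generator follows. The 300-pixel input size and the function name `ssd_anchor_scales` are assumptions for illustration only.

```python
def ssd_anchor_scales(min_scale, max_scale, num_layers):
    """Linearly spaced anchor scales, one per feature map (SSD-paper scheme)."""
    if num_layers == 1:
        return [min_scale]
    step = (max_scale - min_scale) / (num_layers - 1)
    return [min_scale + step * i for i in range(num_layers)]


INPUT_SIZE = 300  # assumed square input size in pixels

# Values from the config in the question.
scales = ssd_anchor_scales(min_scale=0.2, max_scale=0.95, num_layers=6)
# Side length of a square (aspect ratio 1.0) anchor at each layer, in pixels.
anchor_px = [round(s * INPUT_SIZE) for s in scales]

for layer, (s, px) in enumerate(zip(scales, anchor_px)):
    print(f"layer {layer}: scale {s:.2f} -> about {px} px")
```

Under these assumptions the smallest square anchor is about 60 px on a 300 px input, which is probably much larger than a volleyball in a wide shot. Lowering min_scale (and retraining, as discussed above) is therefore a plausible first thing to try.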