Tags: deep-learning, object-detection, yolo

Object detection of a high aspect ratio object


I want to train a deep learning model (say SSD or YOLO) for object detection. The object I want to detect has a very high aspect ratio, say a pencil. I want the output bounding box to be as close as possible to the object, with a similar aspect ratio. How should I optimize the model for this? Should I tune the aspect ratios of all the pre-defined boxes to make them closer to the real object? In my case, the object always appears in the same orientation. Thanks.


Solution

  • Yes, it would be better to use anchors/default boxes with aspect ratios similar to what you see in your data.

    For example, if you use the TF Object Detection API, each model ships with a config file that defines the model's configuration, e.g.:
    https://github.com/tensorflow/models/blob/master/research/object_detection/configs/tf2/ssd_mobilenet_v2_320x320_coco17_tpu-8.config

      anchor_generator {
        ssd_anchor_generator {
          num_layers: 6
          min_scale: 0.2
          max_scale: 0.95
          aspect_ratios: 1.0
          aspect_ratios: 2.0
          aspect_ratios: 0.5
          aspect_ratios: 3.0
          aspect_ratios: 0.3333
        }
      }
    

    Usually the term aspect ratio refers to the result of width/height.
    So, if you wanted only landscape-like objects, you would keep only the aspect ratios greater than 1 (2.0, 3.0).
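    Before editing the config, it can help to measure the width/height ratios actually present in your annotations. A minimal sketch, assuming a hypothetical list of ground-truth boxes in (xmin, ymin, xmax, ymax) pixel format:

    ```python
    import statistics

    # Hypothetical ground-truth boxes (xmin, ymin, xmax, ymax) for a
    # pencil-like object; replace with boxes parsed from your own labels.
    boxes = [
        (10, 40, 310, 80),
        (50, 100, 380, 140),
        (5, 20, 250, 55),
    ]

    # Aspect ratio here follows the width/height convention used above.
    ratios = [(xmax - xmin) / (ymax - ymin) for xmin, ymin, xmax, ymax in boxes]

    print("min: %.2f  median: %.2f  max: %.2f"
          % (min(ratios), statistics.median(ratios), max(ratios)))
    # If every ratio is well above 1, anchors like 1.0, 0.5 and 0.3333
    # add little value and can be replaced with larger ratios.
    ```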

    Also, just to emphasize this point: choosing anchor aspect ratios that match what you expect in your data is an established practice in the literature. See, for example, the YOLOv3 paper (https://arxiv.org/pdf/1804.02767.pdf).

    In YOLOv3, Redmon chose the anchors by running k-means clustering on the bounding-box dimensions in COCO, i.e. by analyzing the most probable object shapes in the dataset.
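    The same idea can be applied to your own dataset. A rough sketch of clustering box sizes to derive anchors, using plain Euclidean k-means for brevity (the YOLO papers use 1 − IoU as the distance, which works better across scales); the box list is hypothetical:

    ```python
    import random

    def kmeans_anchors(whs, k, iters=100, seed=0):
        """Naive k-means over (width, height) pairs.

        Returns k cluster centers to use as anchor shapes. Note: the
        YOLO papers cluster with a 1 - IoU distance; Euclidean distance
        is used here only to keep the sketch short.
        """
        random.seed(seed)
        centers = random.sample(whs, k)
        for _ in range(iters):
            # Assign each box to its nearest center.
            clusters = [[] for _ in range(k)]
            for w, h in whs:
                i = min(range(k),
                        key=lambda c: (w - centers[c][0]) ** 2
                                      + (h - centers[c][1]) ** 2)
                clusters[i].append((w, h))
            # Move each center to the mean of its cluster.
            centers = [
                (sum(w for w, _ in cl) / len(cl),
                 sum(h for _, h in cl) / len(cl)) if cl else centers[i]
                for i, cl in enumerate(clusters)
            ]
        return sorted(centers)

    # Hypothetical pencil-like boxes: widths far larger than heights.
    whs = [(300, 40), (330, 42), (250, 35), (320, 38), (280, 45), (260, 30)]
    anchors = kmeans_anchors(whs, k=2)
    print(anchors)  # all centers are landscape-shaped (width >> height)
    ```

    With strongly landscape-shaped training boxes, every resulting anchor is itself landscape-shaped, which is exactly the property you want for a pencil-like object.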