tensorflow deep-learning google-colaboratory tensorflow-model-garden

colab_utils.annotate(), annotation format

I am following the TensorFlow notebook for few-shot learning ( https://colab.research.google.com/github/tensorflow/models/blob/master/research/object_detection/colab_tutorials/eager_few_shot_od_training_tf2_colab.ipynb#scrollTo=RW1FrT2iNnpy )

In it, the images are annotated using colab_utils.annotate(). I can't work out which annotation format it uses (YOLO, COCO, or something else). Another problem is that the classes can't be specified while drawing the bounding boxes, so I have to remember the order in which I annotated the images and classes and add the labels in code afterwards.

Could someone tell me what that format is, so that I can annotate the images locally on my PC rather than in Colab? That would save a lot of time.

Any help would be appreciated. Regards

Solution

The colab_utils annotation tool is only practical for a single class, because it records boxes but no class labels. Below is the annotation format as documented in the source code:

        [
          // stuff for image 1
          [
            // stuff for rect 1
            {x, y, w, h},
            // stuff for rect 2
            {x, y, w, h},
            ...
          ],
          // stuff for image 2
          [
            // stuff for rect 1
            {x, y, w, h},
            // stuff for rect 2
            {x, y, w, h},
            ...
          ],
          ...
        ]
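To make the structure above concrete, here is a minimal sketch of how such a nested list could be turned into one array per image. It assumes (from my reading of the source, so treat it as an assumption) that x, y, w, h are normalized to the image size and that each rectangle becomes a [ymin, xmin, ymax, xmax] row, with None for an image that has no boxes. The function name rects_to_boxes is hypothetical and not part of colab_utils.

```python
import numpy as np


def rects_to_boxes(annotations):
    """Convert per-image lists of {x, y, w, h} dicts to N x 4 float32 arrays.

    Assumption: each output row is [ymin, xmin, ymax, xmax], and an image
    with no rectangles is represented by None.
    """
    boxes = []
    for rects in annotations:
        if not rects:
            boxes.append(None)
            continue
        rows = [
            [r["y"], r["x"], r["y"] + r["h"], r["x"] + r["w"]]
            for r in rects
        ]
        boxes.append(np.array(rows, dtype=np.float32))
    return boxes


# Two images: the first has one rectangle, the second has none.
example = [[{"x": 0.25, "y": 0.5, "w": 0.25, "h": 0.25}], []]
converted = rects_to_boxes(example)
```

The output list has the same length and order as the input, which is the only link between a box array and its image.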

The annotations carry no reference to their source image, so order matters: the i-th entry of the box list must correspond to the i-th image. This makes the tool impractical for a large training set. The example from the Colab you linked, shown below, is therefore the one to follow: a list with one N x 4 array per image, in image order.

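import numpy as np

# One array per image, in the same order as the images; each row is one box.
# In this notebook the rows are normalized [ymin, xmin, ymax, xmax] values.
gt_boxes = [
    np.array([[0.436, 0.591, 0.629, 0.712]], dtype=np.float32),
    np.array([[0.539, 0.583, 0.73, 0.71]], dtype=np.float32),
    np.array([[0.464, 0.414, 0.626, 0.548]], dtype=np.float32),
    np.array([[0.313, 0.308, 0.648, 0.526]], dtype=np.float32),
    np.array([[0.256, 0.444, 0.484, 0.629]], dtype=np.float32),
]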
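If you annotate locally, you could build the same kind of list yourself. The sketch below is only an illustration under stated assumptions: the local tool is assumed to give pixel boxes as (class_name, xmin, ymin, xmax, ymax) keyed by filename, the target rows are assumed to be normalized [ymin, xmin, ymax, xmax], and the names to_gt_arrays, annotations, image_sizes and class_ids are all hypothetical. Keying by filename removes the need to remember the drawing order, and the class name travels with each box.

```python
import numpy as np


def to_gt_arrays(filenames, annotations, image_sizes, class_ids):
    """Build per-image box and class arrays in the order given by filenames.

    annotations: filename -> list of (class_name, xmin, ymin, xmax, ymax) in pixels
    image_sizes: filename -> (width, height) in pixels
    class_ids:   class_name -> integer id (hypothetical mapping)
    """
    gt_boxes, gt_classes = [], []
    for name in filenames:
        width, height = image_sizes[name]
        rows, labels = [], []
        for cls, xmin, ymin, xmax, ymax in annotations[name]:
            # Normalize pixel coordinates and reorder to [ymin, xmin, ymax, xmax].
            rows.append([ymin / height, xmin / width, ymax / height, xmax / width])
            labels.append(class_ids[cls])
        # reshape keeps a consistent (N, 4) shape even when N is 0.
        gt_boxes.append(np.array(rows, dtype=np.float32).reshape(-1, 4))
        gt_classes.append(np.array(labels, dtype=np.int32))
    return gt_boxes, gt_classes


# Hypothetical data for one 200 x 100 pixel image with a single box.
filenames = ["img1.jpg"]
annotations = {"img1.jpg": [("duck", 50, 25, 100, 75)]}
image_sizes = {"img1.jpg": (200, 100)}
class_ids = {"duck": 1}
boxes, classes = to_gt_arrays(filenames, annotations, image_sizes, class_ids)
```

The images must then be loaded in the same filenames order so that each array still lines up with its image.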