python · pytorch · bounding-box · google-vision · yolov5

Convert boundingPoly to YOLO format


YOLOv5 doesn't support segmentation labels, so I need to convert this boundingPoly into the correct format.

How would you convert this to YOLO format?

        "boundingPoly": {
            "normalizedVertices": [{
                "x": 0.026169369
            }, {
                "x": 0.99525446
            }, {
                "x": 0.99525446,
                "y": 0.688811
            }, {
                "x": 0.026169369,
                "y": 0.688811
            }]
        }

The YOLO format looks like this:

0 0.588196 0.474138 0.823607 0.441645
<object-class> <x> <y> <width> <height>

Solution

  • After our back and forth in the comments I have enough info to answer your question. This is output from the Google Vision API. The normalizedVertices are similar to the YOLO format because they are "normalized": the coordinates are scaled between 0 and 1 rather than given as pixel values from 1 to n. Still, you need to do some transformation to put them into the YOLO format, because in YOLO the X and Y values in the 2nd and 3rd columns refer to the center of the bounding box rather than one of its corners.

    Here is a code snippet that converts the sample annotation at https://ghostbin.com/hOoaz/raw into the following YOLO-format string: '0 0.5080664305 0.5624289849999999 0.9786587390000001 0.56914843'

    #Sample annotation output 
    json_annotation = """
          [
            {
              "mid": "/m/01bjv",
              "name": "Bus",
              "score": 0.9459266,
              "boundingPoly": {
                "normalizedVertices": [
                  {
                    "x": 0.018737061,
                    "y": 0.27785477
                  },
                  {
                    "x": 0.9973958,
                    "y": 0.27785477
                  },
                  {
                    "x": 0.9973958,
                    "y": 0.8470032
                  },
                  {
                    "x": 0.018737061,
                    "y": 0.8470032
                  }
                ]
              }
            }
          ]
    """
    
    import json
    json_object = json.loads(json_annotation, strict=False)
    
    #Map all class names to class id
    class_dict = {"Bus": 0}
    #Get class id for this record
    class_id = class_dict[json_object[0]["name"]]
    
    #Get the max and min values from the polygon points.
    #The Vision API omits coordinates that are 0, so default missing keys to 0.
    normalizedVertices = json_object[0]["boundingPoly"]["normalizedVertices"]
    max_x = max([v.get('x', 0) for v in normalizedVertices])
    max_y = max([v.get('y', 0) for v in normalizedVertices])
    min_x = min([v.get('x', 0) for v in normalizedVertices])
    min_y = min([v.get('y', 0) for v in normalizedVertices])
    
    width = max_x - min_x
    height = max_y - min_y 
    center_x = min_x + (width/2)
    center_y = min_y + (height/2)
    
    yolo_row = f"{class_id} {center_x} {center_y} {width} {height}"
    print(yolo_row)
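
    To actually use this with YOLOv5, the row has to end up in a plain-text label file whose basename matches the image it describes, one row per object. A minimal sketch, assuming the source image is named bus.jpg (a hypothetical filename):

    from pathlib import Path

    # YOLOv5 pairs each image with a .txt label file of the same basename,
    # e.g. bus.jpg -> bus.txt, containing one "<class> <x> <y> <w> <h>" row per object.
    image_name = "bus.jpg"  # hypothetical image filename for this annotation
    label_path = Path(image_name).with_suffix(".txt")
    label_path.write_text(yolo_row + "\n")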
    

    If you are trying to train a YOLO model there are a few more steps you will need to do: you need to set up the images and annotations in a particular folder structure (a rough sketch follows below). But this should help you convert your annotations.
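
    As a rough sketch (the paths, split names, and filenames below are just placeholders), YOLOv5 expects a labels/ tree that mirrors the images/ tree, plus a small dataset YAML pointing at the two splits:

    import os

    # Typical YOLOv5 dataset layout: labels/ mirrors images/, split into train/val,
    # e.g. dataset/images/train/bus.jpg  <->  dataset/labels/train/bus.txt
    for split in ("train", "val"):
        os.makedirs(f"dataset/images/{split}", exist_ok=True)
        os.makedirs(f"dataset/labels/{split}", exist_ok=True)

    # Minimal dataset YAML; class ids and names must match the class_dict used above.
    with open("dataset/data.yaml", "w") as f:
        f.write(
            "train: dataset/images/train\n"
            "val: dataset/images/val\n"
            "nc: 1\n"
            "names: ['Bus']\n"
        )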