python · pytorch · bounding-box · google-vision · yolov5

Convert boundingPoly to YOLO format


YOLOv5 doesn't support segmentation labels, so I need to convert this boundingPoly into the correct format.

How would you convert this to YOLO format?

        "boundingPoly": {
            "normalizedVertices": [{
                "x": 0.026169369
            }, {
                "x": 0.99525446
            }, {
                "x": 0.99525446,
                "y": 0.688811
            }, {
                "x": 0.026169369,
                "y": 0.688811
            }]
        }

The YOLO format looks like this:

0 0.588196 0.474138 0.823607 0.441645
<object-class> <x> <y> <width> <height>

Solution

  • After our back and forth in the comments I have enough info to answer your question. This is output from the Google Vision API. The normalizedVertices are similar to the YOLO format because they are "normalized": the coordinates are scaled between 0 and 1 rather than given as pixel values from 1 to n. Still, you need to do some transformation to put them into the YOLO format, because in YOLO the X and Y values in the 2nd and 3rd columns refer to the center of the bounding box rather than one of its corners.

    Here is a code snippet that converts the sample annotation at https://ghostbin.com/hOoaz/raw into the following YOLO-format string: '0 0.5080664305 0.5624289849999999 0.9786587390000001 0.56914843'

    #Sample annotation output 
    json_annotation = """
          [
            {
              "mid": "/m/01bjv",
              "name": "Bus",
              "score": 0.9459266,
              "boundingPoly": {
                "normalizedVertices": [
                  {
                    "x": 0.018737061,
                    "y": 0.27785477
                  },
                  {
                    "x": 0.9973958,
                    "y": 0.27785477
                  },
                  {
                    "x": 0.9973958,
                    "y": 0.8470032
                  },
                  {
                    "x": 0.018737061,
                    "y": 0.8470032
                  }
                ]
              }
            }
          ]
    """
    
    import json
    json_object = json.loads(json_annotation, strict=False)
    
    #Map all class names to class id
    class_dict = {"Bus": 0}
    #Get class id for this record
    class_id = class_dict[json_object[0]["name"]]
    
    #Get the max and min values from the polygon points.
    #The Vision API omits coordinates that are 0, so default missing keys to 0.
    normalizedVertices = json_object[0]["boundingPoly"]["normalizedVertices"]
    max_x = max([v.get('x', 0) for v in normalizedVertices])
    max_y = max([v.get('y', 0) for v in normalizedVertices])
    min_x = min([v.get('x', 0) for v in normalizedVertices])
    min_y = min([v.get('y', 0) for v in normalizedVertices])
    
    width = max_x - min_x
    height = max_y - min_y 
    center_x = min_x + (width/2)
    center_y = min_y + (height/2)
    
    yolo_row = f"{class_id} {center_x} {center_y} {width} {height}"
    print(yolo_row)
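
    To actually use this with YOLOv5, the row has to end up in a plain-text label file whose basename matches the image it describes, one row per object. A minimal sketch, assuming the source image is named bus.jpg (a hypothetical filename):

    from pathlib import Path

    # YOLOv5 pairs each image with a .txt label file of the same basename,
    # e.g. bus.jpg -> bus.txt, containing one "<class> <x> <y> <w> <h>" row per object.
    image_name = "bus.jpg"  # hypothetical image filename for this annotation
    label_path = Path(image_name).with_suffix(".txt")
    label_path.write_text(yolo_row + "\n")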
    

    If you are trying to train a YOLO model there are a few more steps you will need to do: you need to set up the images and annotations in a particular folder structure (a rough sketch follows below). But this should help you convert your annotations.
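
    As a rough sketch (the paths, split names, and filenames below are just placeholders), YOLOv5 expects a labels/ tree that mirrors the images/ tree, plus a small dataset YAML pointing at the two splits:

    import os

    # Typical YOLOv5 dataset layout: labels/ mirrors images/, split into train/val,
    # e.g. dataset/images/train/bus.jpg  <->  dataset/labels/train/bus.txt
    for split in ("train", "val"):
        os.makedirs(f"dataset/images/{split}", exist_ok=True)
        os.makedirs(f"dataset/labels/{split}", exist_ok=True)

    # Minimal dataset YAML; class ids and names must match the class_dict used above.
    with open("dataset/data.yaml", "w") as f:
        f.write(
            "train: dataset/images/train\n"
            "val: dataset/images/val\n"
            "nc: 1\n"
            "names: ['Bus']\n"
        )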