Tags: python, machine-learning, label, object-detection, yolo

Convert Kitti labels to Yolo


I am trying to convert labels from the Kitti format to the Yolo format, but after the conversion the bounding box is misplaced. This is the Kitti bounding box:

[image: the Kitti bounding box]

This is the conversion code:

def convertToYoloBBox(bbox, size):
    # Yolo uses bounding bbox coordinates and size relative to the image size.
    # This is taken from https://pjreddie.com/media/files/voc_label.py .
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (bbox[0] + bbox[1]) / 2.0
    y = (bbox[2] + bbox[3]) / 2.0
    w = bbox[1] - bbox[0]
    h = bbox[3] - bbox[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


convert = convertToYoloBBox([kitti_bbox[0], kitti_bbox[1], kitti_bbox[2], kitti_bbox[3]], image.shape[:2])

The function performs the normalization that Yolo requires and outputs the following:

(0.14763590391908976, 0.3397063758389261, 0.20452591656131477, 0.01810402684563757)

But when I try to check whether the normalization is done correctly with this code:

x = int(convert[0] * image.shape[0])
y = int(convert[1] * image.shape[1])
width = x + int(convert[2] * image.shape[0])
height = y + int(convert[3] * image.shape[1])

cv.rectangle(image, (int(x), int(y)), (int(width), int(height)), (255, 0, 0), 2)

the bounding box is misplaced:

[image: the misplaced bounding box]

Any suggestions? Is the conversion function correct, or is the problem in the checking code?


Solution

  • You got the centroid calculation wrong.

    Kitti labels are given in the order left, top, right, bottom.

    To get the centroid, you have to compute (left + right) / 2 and (top + bottom) / 2,

    so your code becomes:

    x = (bbox[0] + bbox[2]) / 2.0
    y = (bbox[1] + bbox[3]) / 2.0
    w = bbox[2] - bbox[0]
    h = bbox[3] - bbox[1]
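
A fuller sketch of the corrected conversion, together with an inverse that can be used for the visual check, might look like the following. The function names kitti_to_yolo and yolo_to_corners and the sample numbers are hypothetical and are not taken from the question. The sketch assumes the image size is passed explicitly as width and height, and it treats the Yolo x and y as the box center, so the top-left corner has to be recovered by subtracting half the width and height before drawing.

```python
def kitti_to_yolo(bbox, img_w, img_h):
    """Convert a Kitti box (left, top, right, bottom) in pixels to a Yolo box
    (x_center, y_center, width, height), each normalized by the image size."""
    left, top, right, bottom = bbox
    x_c = (left + right) / 2.0 / img_w
    y_c = (top + bottom) / 2.0 / img_h
    w = (right - left) / img_w
    h = (bottom - top) / img_h
    return (x_c, y_c, w, h)


def yolo_to_corners(yolo_bbox, img_w, img_h):
    """Invert the conversion: return integer pixel corners
    (left, top, right, bottom) suitable for drawing a rectangle."""
    x_c, y_c, w, h = yolo_bbox
    left = int(round((x_c - w / 2.0) * img_w))
    top = int(round((y_c - h / 2.0) * img_h))
    right = int(round((x_c + w / 2.0) * img_w))
    bottom = int(round((y_c + h / 2.0) * img_h))
    return (left, top, right, bottom)


if __name__ == "__main__":
    # Made-up image size and box, for illustration only.
    img_w, img_h = 1242, 375
    kitti_bbox = (100.0, 120.0, 300.0, 200.0)

    yolo_bbox = kitti_to_yolo(kitti_bbox, img_w, img_h)
    print(yolo_bbox)
    # Converting back should reproduce the original pixel corners.
    print(yolo_to_corners(yolo_bbox, img_w, img_h))
```

Two details of the question's code are worth checking against this sketch. For an image loaded with OpenCV, image.shape[:2] is (height, width), so the sizes would be unpacked as img_h, img_w = image.shape[:2] rather than passed straight through as (width, height). The corners returned by yolo_to_corners can then be given to cv.rectangle as (left, top) and (right, bottom).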