Search code examples
pythonopencvyolo

How to convert Yolo format bounding box coordinates into OpenCV format


I have Yolo format bounding box annotations of objects saved in a .txt files. Now I want to load those coordinates and draw it on the image using OpenCV, but I don’t know how to convert those float values into OpenCV format coordinates values

I tried this post but it didn’t help, below is a sample example of what I am trying to do

Code and output

import matplotlib.pyplot as plt
import cv2

img = cv2.imread(<image_path>)
dh, dw, _ = img.shape
        
fl = open(<label_path>, 'r')
data = fl.readlines()
fl.close()
        
for dt in data:
            
    _, x, y, w, h = dt.split(' ')
            
    nx = int(float(x)*dw)
    ny = int(float(y)*dh)
    nw = int(float(w)*dw)
    nh = int(float(h)*dh)
            
    cv2.rectangle(img, (nx,ny), (nx+nw,ny+nh), (0,0,255), 1)
            
plt.imshow(img)

enter image description here

Actual Annotations and Image

0 0.286972 0.647157 0.404930 0.371237 
0 0.681338 0.366221 0.454225 0.418060

enter image description here


Solution

  • There's another Q&A on this topic, and there's this1 interesting comment below the accepted answer. The bottom line is, that the YOLO coordinates have a different centering w.r.t. to the image. Unfortunately, the commentator didn't provide the Python port, so I did that here:

    import cv2
    import matplotlib.pyplot as plt
    
    img = cv2.imread(<image_path>)
    dh, dw, _ = img.shape
    
    fl = open(<label_path>, 'r')
    data = fl.readlines()
    fl.close()
    
    for dt in data:
    
        # Split string to float
        _, x, y, w, h = map(float, dt.split(' '))
    
        # Taken from https://github.com/pjreddie/darknet/blob/810d7f797bdb2f021dbe65d2524c2ff6b8ab5c8b/src/image.c#L283-L291
        # via https://stackoverflow.com/questions/44544471/how-to-get-the-coordinates-of-the-bounding-box-in-yolo-object-detection#comment102178409_44592380
        l = int((x - w / 2) * dw)
        r = int((x + w / 2) * dw)
        t = int((y - h / 2) * dh)
        b = int((y + h / 2) * dh)
        
        if l < 0:
            l = 0
        if r > dw - 1:
            r = dw - 1
        if t < 0:
            t = 0
        if b > dh - 1:
            b = dh - 1
    
        cv2.rectangle(img, (l, t), (r, b), (0, 0, 255), 1)
    
    plt.imshow(img)
    plt.show()
    

    So, for some Lenna image, that'd be the output, which I think shows the correct coordinates w.r.t. your image:

    Output

    ----------------------------------------
    System information
    ----------------------------------------
    Platform:     Windows-10-10.0.16299-SP0
    Python:       3.8.5
    Matplotlib:   3.3.2
    OpenCV:       4.4.0
    ----------------------------------------
    

    1Please upvote the linked answers and comments.