deep-learning, object-detection, bounding-box, yolov8, roboflow

What kind of default image preprocessing is applied in the YOLOv8 model?


I am currently working with the Ultralytics YOLOv8 model for object detection. I trained the model on a custom dataset that I annotated on Roboflow and exported in YOLOv8 format. The training results look good. However, there is a mismatch between the bounding-box coordinates from my manual annotation on Roboflow and those returned by prediction.

For example, I have an image called image_1.jpg that is annotated on Roboflow and is part of the training set. Roboflow defined the coordinates (as part of the label):

[0 0.36953125 0.39609375 0.54765625 0.7875]

where the last four numbers show the x1, x2, y1, y2 coordinates of the bounding box. However, when I passed this image to the predict method (just for checking) using the following piece of code:

from ultralytics import YOLO

model = YOLO('best_weights.pt')
result = model.predict(path_of_example_image, save=True)
boxes = result[0].boxes.xyxy

I got the coordinates:

tensor([[ 22.5558, 9.7429, 619.7601, 345.7614]])

When I draw the bounding box with the predicted coordinates, the area covers the object completely (with a 0.99 confidence score on the (correctly) predicted class). On the other hand, when I draw the bounding box with the coordinates from the manual annotation, the object is not covered at all, since, obviously, all of the coordinates are below 1. So I just wonder: what kind of preprocessing is applied within the model so that the coordinates from Roboflow's labelling make sense and my training result is actually good?


Solution

  • YOLOv8 can output normalised values too. Note that xyxy is not the YOLO label format: each row of a YOLO label is class, x_center, y_center, width, height, with the four box values normalised to [0, 1] by the image width and height. To get predictions in the same format as your Roboflow labels, use the normalised xywhn attribute instead. Just change this line:

    boxes = result[0].boxes.xyxy
    

    to:

    boxes = result[0].boxes.xywhn
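
If you instead want to check the annotation against the pixel-space prediction, you can denormalise the label yourself. Below is a minimal sketch; the image size used here (640 × 480) is an assumption purely for illustration, so substitute your image's actual dimensions:

    # Minimal sketch: denormalise a YOLO label (class, x_center, y_center,
    # width, height, all in [0, 1]) into pixel corner coordinates so it
    # can be compared directly with boxes.xyxy.

    def yolo_label_to_xyxy(x_c, y_c, w, h, img_w, img_h):
        """Convert normalised centre/size values to pixel (x1, y1, x2, y2)."""
        x1 = (x_c - w / 2) * img_w
        y1 = (y_c - h / 2) * img_h
        x2 = (x_c + w / 2) * img_w
        y2 = (y_c + h / 2) * img_h
        return x1, y1, x2, y2

    # The label from the question, minus the leading class id (0).
    # img_w and img_h are assumed values -- use your image's real size.
    print(yolo_label_to_xyxy(0.36953125, 0.39609375, 0.54765625, 0.7875,
                             img_w=640, img_h=480))

Ultralytics also exposes result[0].boxes.xyxyn (normalised corners) and result[0].boxes.xywh (pixel-space centre/size), so you can compare in whichever coordinate space is more convenient.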