machine-learning, pytorch, artificial-intelligence, yolo, yolov8

YOLOv8 results have no 'boxes' or 'masks' properties


I've trained a YOLOv8 model to identify objects in an intersection (i.e. cars, roads, etc.). It is working OK and I can get the output as an image with the objects of interest segmented.

However, what I need is to capture the raw geometries (polygons) so I can save them to a txt file later on.

I tried what I found in the documentation (https://docs.ultralytics.com/modes/predict/#key-features-of-predict-mode), but the returned object is not what the documentation describes.

In fact, the result is a list of PyTorch tensors.

Here's my code:

import argparse
import cv2
import numpy as np
from pathlib import Path
from ultralytics.yolo.engine.model import YOLO    
    
# Parse command line arguments
parser = argparse.ArgumentParser()
parser.add_argument('--source', type=str, required=True, help='Source image directory or file')
parser.add_argument('--output', type=str, default='output', help='Output directory')
args = parser.parse_args()

# Create output directory if it doesn't exist
Path(args.output).mkdir(parents=True, exist_ok=True)

# Model path
model_path = r'C:\_Projects\best_100img.pt'

# Load your model directly
model = YOLO(model_path)
model.fuse()

# Load image(s)
if Path(args.source).is_dir():
    image_paths = list(Path(args.source).rglob('*.tiff'))
else:
    image_paths = [args.source]

# Process each image
for image_path in image_paths:
    img = cv2.imread(str(image_path))
    if img is None:
        continue

    # Perform inference
    predictions = model.predict(image_path, save=True, save_txt=True)
    
print("Processing complete.")

Here's the problem: the returned object (the predictions variable) has no boxes, masks, keypoints, and so on.

I guess my questions are:

  • Why is the result so different from the documentation?
  • Is there a conversion step?

Solution

  • As the comments established, you are using an old release, Ultralytics 8.0.0. That release returns the results as a list of torch.Tensor objects rather than ultralytics.engine.results.Results objects, and only the latter have attributes such as boxes, masks, keypoints, probs and obb. The documentation describes the latest release (8.2.24 at the time of writing), whereas 8.0.0 dates from January 2023.

    The easiest way to solve your problem is to upgrade the Ultralytics version to the latest one, so you will get all the results parameters described in the documentation.

    If you cannot upgrade, you will need to post-process the output yourself, which requires understanding the result format that version 8.0.0 returns.

    OBJECT DETECTION task results, version==8.0.0

    # for 3 detected objects
    
    [tensor([[2.89000e+02, 7.10000e+01, 1.44000e+03, 5.07000e+02, 8.91113e-01, 2.00000e+00],
             [1.26700e+03, 6.00000e+01, 1.68200e+03, 3.19000e+02, 8.31055e-01, 2.00000e+00],
             [6.96000e+02, 0.00000e+00, 1.32200e+03, 1.31000e+02, 2.56836e-01, 7.00000e+00]], device='cuda:0')]
    
    # where every row stores 6 values, the first 4 being corner coordinates in pixels:
    [x_min, y_min, x_max, y_max, confidence, class_id]
    
    # for easier manipulation, results[0].tolist() returns plain Python lists:
    [[289.0, 71.0, 1440.0, 507.0, 0.89111328125, 2.0],
     [1267.0, 60.0, 1682.0, 319.0, 0.8310546875, 2.0],
     [696.0, 0.0, 1322.0, 131.0, 0.2568359375, 7.0]]
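
If upgrading is not possible, the post-processing of the detection output could look roughly like the sketch below. It works on the plain lists produced by results[0].tolist(); the function name parse_detections and the min_confidence parameter are illustrative choices, not part of Ultralytics.

```python
def parse_detections(rows, min_confidence=0.0):
    """Turn rows of 6 numbers into dicts, dropping low-confidence rows.

    Each row is expected to hold four box values, then confidence, then class id.
    """
    detections = []
    for row in rows:
        *box, confidence, class_id = row
        if confidence < min_confidence:
            continue
        detections.append({
            "box": tuple(box),           # the four box values, left untouched
            "confidence": confidence,
            "class_id": int(class_id),   # stored as a float in the tensor
        })
    return detections


# Rows copied from the sample detection output (results[0].tolist()).
rows = [
    [289.0, 71.0, 1440.0, 507.0, 0.89111328125, 2.0],
    [1267.0, 60.0, 1682.0, 319.0, 0.8310546875, 2.0],
    [696.0, 0.0, 1322.0, 131.0, 0.2568359375, 7.0],
]

for det in parse_detections(rows, min_confidence=0.5):
    print(det["class_id"], det["box"])
```

With min_confidence=0.5 the third sample row is filtered out, leaving the two class 2 objects.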
    

    OBJECT SEGMENTATION task results, version==8.0.0. The first tensor has the same layout as for detection; a second torch.Tensor holds one segmentation mask per object.

    # for 3 detected objects
    
    [[tensor([[1.23000e+02, 8.90000e+01, 4.21000e+02, 2.21000e+02, 2.55216e-01, 7.00000e+00],
              [1.26700e+03, 5.80000e+01, 1.68100e+03, 3.17000e+02, 8.04158e-01, 2.00000e+00],
              [2.70000e+02, 7.70000e+01, 1.46000e+03, 4.98000e+02, 8.19106e-01, 2.00000e+00]], device='cuda:0'),
      tensor([[[0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.],
               ...,
               [0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.]],
      
              [[0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.],
               ...,
               [0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.]],
      
              [[0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.],
               ...,
               [0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.],
               [0., 0., 0.,  ..., 0., 0., 0.]]], device='cuda:0')]]
    
    

    Once you know the data format this old Ultralytics version returns, you can read the values and convert them to the format you need. Upgrading Ultralytics to the latest version is still the best way to get the most out of the framework.

    Install Ultralytics: https://docs.ultralytics.com/quickstart/
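
After upgrading, a sketch along these lines may be enough to save the polygons to a txt file. It assumes a segmentation model and a recent Ultralytics release in which each Results object exposes masks.xy (one array of pixel coordinates per object) and boxes.cls. The names format_polygon_lines and export_polygons, and the one-line-per-object layout, are illustrative choices rather than an Ultralytics format.

```python
from pathlib import Path


def format_polygon_lines(class_ids, polygons):
    """Return one text line per object: class id followed by x y pairs."""
    lines = []
    for class_id, polygon in zip(class_ids, polygons):
        coords = " ".join(f"{x:.1f} {y:.1f}" for x, y in polygon)
        lines.append(f"{int(class_id)} {coords}")
    return lines


def export_polygons(model_path, image_path, out_path):
    """Run prediction on one image and write its polygons to out_path."""
    # Imported here so the formatting helper above works without Ultralytics.
    from ultralytics import YOLO

    model = YOLO(model_path)
    result = model.predict(str(image_path))[0]  # one Results object per image
    if result.masks is None:  # nothing was segmented in this image
        lines = []
    else:
        class_ids = result.boxes.cls.tolist()
        polygons = [polygon.tolist() for polygon in result.masks.xy]
        lines = format_polygon_lines(class_ids, polygons)
    Path(out_path).write_text("\n".join(lines))
    return lines


# The formatting step on its own, with a made-up triangle of class 2:
print(format_polygon_lines([2.0], [[[10, 20], [30.5, 40], [15, 60]]]))
```

Keeping the formatting in its own function means the same writer can be reused for polygons obtained from the old 8.0.0 output.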