Tags: python, pytorch, yolov8

Ultralytics YOLO: Error when trying to extract bounding box coordinates from Results.boxes


I am using Ultralytics YOLO for license plate detection, and I'm encountering an issue when trying to extract bounding box coordinates from the Results.boxes object. I have inspected the structure of the Results.boxes object, but I am having difficulty accessing the bounding box information correctly.

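# Imports assumed from usage (the question does not show them); cv2 is aliased as opencv
from pathlib import Path

import cv2 as opencv
from ultralytics import YOLO


class ImageProcessing: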
    def __init__(self, model_path: Path, input_image: Path, output_image: Path):
        if not isinstance(model_path, Path):
            raise TypeError("model_path must be a pathlib.Path instance")
        if not isinstance(input_image, Path) or not isinstance(output_image, Path):
            raise TypeError("input_image and output_image must be pathlib.Path instances")
        # Load the YOLO model from the provided path
        self.model = YOLO(str(model_path))
        self.input_image = input_image
        self.output_image = output_image

    def ascertain_license_plates_as_image(self, threshold: float = 0.5, fontscale: float = 1.3, color: tuple = (0, 255, 0), thickness: int = 3):
        image = opencv.imread(str(self.input_image))
        results = self.model(image)

        # Check if results is a list and get the first result
        if isinstance(results, list):
            results = results[0]

        # Iterate through each detected object
        for box in results.boxes:
            # Extract coordinates, confidence, and class ID
            x1, y1, x2, y2, conf, class_id = box.data[0][0], box.data[0][1], box.data[0][2], box.data[0][3], box.conf.item(), int(box.cls.item())
            if conf > threshold:
                opencv.rectangle(image, (int(x1), int(y1)), (int(x2), int(y2)), color, thickness)
                label = results.names[class_id].upper() if results.names else f'class {class_id}'
                opencv.putText(image, label, (int(x1), int(y1) - 10), opencv.FONT_HERSHEY_SIMPLEX, fontscale, color, thickness, opencv.LINE_AA)

        opencv.imwrite(str(self.output_image), image)
        return results

However, I get an IndexError, which suggests that my indexing is wrong for this particular Boxes object. Worse, cv2 does not draw a rectangle around the license plate.


Solution

  • After results = results[0], each iteration of for box in results.boxes yields an ultralytics.engine.results.Boxes object that holds a single detection, with the following attributes (example values):

    cls: tensor([15.], device='cuda:0')
    conf: tensor([0.5666], device='cuda:0')
    data: tensor([[5.7743e+02, 1.3452e+02, 2.5194e+03, 2.7360e+03, 5.6664e-01, 1.5000e+01]], device='cuda:0')
    id: None
    is_track: False
    orig_shape: (2736, 3648)
    shape: torch.Size([1, 6])
    xywh: tensor([[1548.4325, 1435.2581, 1942.0007, 2601.4839]], device='cuda:0')
    xywhn: tensor([[0.4245, 0.5246, 0.5323, 0.9508]], device='cuda:0')
    xyxy: tensor([[ 577.4322,  134.5160, 2519.4329, 2736.0000]], device='cuda:0')
    xyxyn: tensor([[0.1583, 0.0492, 0.6906, 1.0000]], device='cuda:0')
    

    It is better to read these attributes directly by name and unpack the values from the torch.Tensor where necessary:

    results = results[0]
    for box in results.boxes:
      x1, y1, x2, y2 = box.xyxy.tolist()[0]
      conf = box.conf.item()
      class_id = int(box.cls.item())
    
      # print([x1, y1, x2, y2], conf, class_id)
      # output: [577.4321899414062, 134.5160369873047, 2519.432861328125, 2736.0] 0.5666425228118896 15
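
To connect this back to the drawing code in the question, here is a minimal sketch of the filtering and integer conversion step on its own. It assumes you have already called results.boxes.data.tolist(), and that each row has the six-column layout shown in the data attribute above (x1, y1, x2, y2, conf, cls, with is_track False). The names Detection and rows_to_detections are hypothetical helpers, not part of Ultralytics.

```python
from typing import List, NamedTuple, Sequence


class Detection(NamedTuple):
    """One box with integer pixel corners (hypothetical helper type)."""
    x1: int
    y1: int
    x2: int
    y2: int
    conf: float
    class_id: int


def rows_to_detections(rows: Sequence[Sequence[float]],
                       threshold: float = 0.5) -> List[Detection]:
    """Keep rows whose confidence exceeds threshold and cast corners to int.

    Assumes each row is [x1, y1, x2, y2, conf, cls], i.e. the six-column
    layout of results.boxes.data.tolist() when tracking is off.
    """
    detections = []
    for x1, y1, x2, y2, conf, cls in rows:
        if conf > threshold:
            detections.append(
                Detection(int(x1), int(y1), int(x2), int(y2), conf, int(cls))
            )
    return detections


# Example row copied from the data attribute shown above
rows = [[577.43, 134.52, 2519.4, 2736.0, 0.56664, 15.0]]
print(rows_to_detections(rows))
```

Each Detection can then be passed to the drawing calls from the question, for example (d.x1, d.y1) and (d.x2, d.y2) as the two corner points of the rectangle, and d.class_id as the key into results.names.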