I am working on a project where I have trained a model using YOLO (from the Ultralytics library - version: 8.0.132) to detect specific objects in images. My classification categories are [A, B, C, D].
In my Jupyter Notebook, I have the following code:
from ultralytics import YOLO
model_path = "{pathToModel}/best.pt"
print("Loading model...")
model = YOLO(model_path)
result = model.predict("{pathToImage}.png", conf=0.3, save=True)
print(result)
# Checking the presence of predictions
for r in result:
    print(r.masks)
This successfully saves an image showing the predicted segments with bounding boxes and probabilities (e.g., category B with a probability of 0.61).
However, when I modify the predict call (after reading through the documentation: https://docs.ultralytics.com/de/modes/predict/) to exclude bounding boxes (boxes=False), the saved image shows the segments without the boxes and, crucially, without labels or probabilities.
Attempting to include labels and probabilities with labels=True (same with show_labels) and probs=True results in the following error:
Traceback (most recent call last):
File c:\Users\myuser\AppData\Local\Programs\Python\Python311\Lib\site-packages\IPython\core\interactiveshell.py:3442 in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
Cell In[39], line 1
result = model.predict("{pathToMyImage}.png" , conf = 0.3, save = True, boxes=False, labels=True, probs=True)
File c:\Users\myuser\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\_contextlib.py:115 in decorate_context
return func(*args, **kwargs)
File c:\Users\myuser\AppData\Local\Programs\Python\Python311\Lib\site-packages\ultralytics\yolo\engine\model.py:249 in predict
self.predictor = TASK_MAP[self.task][3](overrides=overrides, _callbacks=self.callbacks)
File c:\Users\myuser\AppData\Local\Programs\Python\Python311\Lib\site-packages\ultralytics\yolo\v8\segment\predict.py:13 in __init__
super().__init__(cfg, overrides, _callbacks)
File c:\Users\myuser\AppData\Local\Programs\Python\Python311\Lib\site-packages\ultralytics\yolo\engine\predictor.py:86 in __init__
self.args = get_cfg(cfg, overrides)
File c:\Users\myuser\AppData\Local\Programs\Python\Python311\Lib\site-packages\ultralytics\yolo\cfg\__init__.py:114 in get_cfg
check_cfg_mismatch(cfg, overrides)
File c:\Users\myuser\AppData\Local\Programs\Python\Python311\Lib\site-packages\ultralytics\yolo\cfg\__init__.py:187 in check_cfg_mismatch
raise SyntaxError(string + CLI_HELP_MSG) from e
...
Docs: https://docs.ultralytics.com
Community: https://community.ultralytics.com
GitHub: https://github.com/ultralytics/ultralytics
It seems that the combination of show_boxes=False, show_conf=True, show_labels=True does not work (as described here: Unable to hide bounding boxes and labels in YOLOv8). show_boxes throws an error, and boxes more or less overrides everything else I set as an argument.
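The traceback ends in check_cfg_mismatch, which suggests that the keyword arguments are validated against a set of known configuration keys before prediction starts. Below is a minimal sketch of that kind of check, written only to illustrate the idea; it is not the library's implementation. The helper name check_overrides and the example key set are hypothetical, and the key set contains only the arguments that were accepted in the calls above, not the full list for any Ultralytics version.

```python
from difflib import get_close_matches


def check_overrides(overrides, valid_keys):
    """Split keyword arguments into accepted ones and unknown ones.

    For every unknown key, a list of similarly spelled valid keys is
    returned as a hint (the list may be empty).
    """
    accepted, unknown = {}, {}
    for key, value in overrides.items():
        if key in valid_keys:
            accepted[key] = value
        else:
            unknown[key] = get_close_matches(key, valid_keys, n=3, cutoff=0.6)
    return accepted, unknown


# Illustrative key set only: the three arguments that worked in the calls above.
example_valid_keys = {"conf", "save", "boxes"}

accepted, unknown = check_overrides(
    {"conf": 0.3, "save": True, "boxes": False, "labels": True, "probs": True},
    example_valid_keys,
)
print("accepted:", accepted)
print("unknown:", sorted(unknown))
```

Running a check like this against the real list of configuration keys for the installed version would show which of the attempted arguments are rejected before the predictor is even built.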
I followed the documentation but am struggling to display both the segment and its associated label/probability without the bounding box. Any insights or suggestions on how to achieve this would be greatly appreciated.
I wrote a small Python script that draws the segments and shows the labels and confidence values. For me this is only a workaround, though. If there is a simpler solution using the arguments mentioned above, feel free to add it.
The code:
import cv2
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from torchvision.transforms import functional as F

image_path = "{pathToImage}.png"  # the same image that was passed to model.predict
prediction_results = result[0]

# Load the original image and convert it to an RGB NumPy array
original_image = Image.open(image_path)
if original_image.mode != 'RGB':
    original_image = original_image.convert('RGB')
original_image = np.array(original_image)
display_image = original_image.copy()

if prediction_results.masks is not None:
    # For each detection, color the mask area and add the label text
    for mask_tensor, box in zip(prediction_results.masks.data, prediction_results.boxes):
        # Scale the mask to the size of the original image
        resized_mask = F.resize(mask_tensor.unsqueeze(0), list(display_image.shape[:2]), antialias=True).squeeze() > 0
        # Move the mask to the CPU and convert it to a boolean NumPy array for indexing
        resized_mask = resized_mask.cpu().numpy()
        # Paint the mask area red (the array is in RGB order)
        display_image[resized_mask] = (255, 0, 0)
        # Extract the class ID and confidence from the box
        class_id = int(box.cls[0])  # Class ID
        confidence = float(box.conf[0])  # Confidence as a plain float
        class_name = prediction_results.names[class_id]  # Class name based on the ID
        # Add the text at the top-left corner of the box (the box itself is not drawn)
        label = f"{class_name}: {confidence:.2f}"
        coords = box.xyxy.cpu().numpy()  # Convert to a NumPy array
        x1, y1 = int(coords[0][0]), int(coords[0][1])  # Extract x1 and y1
        cv2.putText(display_image, label, (x1, y1), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
else:
    print("No masks found or mask recognition not supported.")

# Convert to a PIL image and display
display_image = Image.fromarray(display_image)
plt.imshow(display_image)
plt.axis('off')
plt.show()
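One possible refinement of the script above is to anchor the label on the segment itself instead of on the corner of the hidden box, for example at the centroid of the segment outline. The sketch below computes that centroid in plain Python with the shoelace formula. The helper name polygon_centroid is hypothetical, and it assumes the outline is available as a list of (x, y) pixel coordinates. Recent Ultralytics versions expose such coordinates as masks.xy, but whether that attribute exists in 8.0.132 is an assumption that should be verified.

```python
def polygon_centroid(points):
    """Return the (x, y) centroid of a simple polygon given as (x, y) pairs.

    Uses the shoelace formula. If the polygon has (almost) zero area,
    the plain average of the vertices is returned instead.
    """
    pts = [(float(x), float(y)) for x, y in points]
    if not pts:
        raise ValueError("polygon has no points")

    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    # Pair every vertex with its successor, wrapping around at the end
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if abs(twice_area) < 1e-9:
        # Degenerate polygon (a point or a line): fall back to the vertex mean
        n = len(pts)
        return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

    # The centroid formula divides by 6 * area, which equals 3 * twice_area
    return (cx / (3.0 * twice_area), cy / (3.0 * twice_area))


# A 4x4 square has its centroid in the middle
print(polygon_centroid([(0, 0), (4, 0), (4, 4), (0, 4)]))  # (2.0, 2.0)
```

The resulting coordinates, rounded to integers, could be passed to cv2.putText in place of (x1, y1). For strongly concave segments the centroid can fall outside the mask, so this is a heuristic rather than a guaranteed placement.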