python, object-detection, yolo, yolov5

Why are the two results different in YOLOv5?


I wanted to count the vehicles in a picture using YOLOv5. However, the result from the torch.hub model was different from the result of detect.py.

0. img: [input image]

1. model_result

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # or yolov5m, yolov5l, yolov5x, custom

# Images
img = r'D:\code\YOLO\dataset\img\public02.png'  # raw string keeps the backslashes literal; or file, Path, PIL, OpenCV, numpy, list

# Inference
results = model(img)

# Results
results.print()  # or .show(), .save(), .crop(), .pandas(), etc.

result -> (no detections)

2. detect.py

from IPython.display import Image
import os

val_img_path = r'D:\code\YOLO\dataset\img\public02.png'  # raw string keeps the backslashes literal
 
!python detect.py --img 416 --conf 0.25 --source "{val_img_path}"

result -> [image: detect.py output with detections]

I know that if I don't specify the --weights option in detect.py, the default yolov5s model is used. But Result 1 differs from Result 2 even though both use the same model.


Solution

  • This looks like an image pre-processing issue.

    Indeed, with your example the model loaded from torch hub gives a different output than detect.py. The source code of detect.py runs its own image loading and pre-processing, whereas it is not obvious to me what the hub model does to an input that is passed as a file path. After the hub model's own pre-processing, the image looks like this:

    [image: Torch hub model]

    That is effectively the image being fed into the model, so I would not expect any detections from it.

    But then I tried doing the pre-processing myself (this is also described in their tutorial):

    import torch
    import cv2
    
    # Model
    model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # or yolov5m, yolov5l, yolov5x, custom
    
    # Image
    imgPath = '/content/9X9FP.png'
    img = cv2.imread(imgPath)[..., ::-1]  # Pre-processing OpenCV image (BGR to RGB)
    
    # Inference
    results = model(img)
    
    # Results
    results.save()
    
    

    And it all works fine:

    [image: Good detections]

    So, as a quick and easy fix, I would just do the pre-processing myself; it is only one extra line. Good luck!
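
Since the original goal was a vehicle count, here is a minimal sketch of the two pieces involved: the BGR to RGB channel reversal used above, and a tally of vehicle detections by class name. It runs on made-up stand-in data so it needs no model or image file. The helper names (`bgr_to_rgb`, `count_vehicles`), the `VEHICLE_CLASSES` set, and the sample detections are all assumptions for illustration; in particular, which COCO classes count as "vehicles" is a choice you would adjust.

```python
from collections import Counter

import numpy as np

# Assumption: which COCO class names count as "vehicles" is a choice made
# for this sketch, not something stated in the question or the answer.
VEHICLE_CLASSES = {"car", "motorcycle", "bus", "truck"}


def bgr_to_rgb(image):
    """Reverse the channel axis of an H x W x 3 array (same as image[..., ::-1])."""
    # ascontiguousarray copies the reversed view into an ordinary memory layout
    return np.ascontiguousarray(image[..., ::-1])


def count_vehicles(class_names):
    """Return (total, per-class counts) for names that are in VEHICLE_CLASSES."""
    counts = Counter(name for name in class_names if name in VEHICLE_CLASSES)
    return sum(counts.values()), dict(counts)


# Made-up stand-ins so the sketch runs without a model or an image file.
# Two pixels in BGR order, as cv2.imread would return them: pure blue, pure red.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)

# Hypothetical detected class names, one entry per detection
detected = ["car", "car", "person", "truck", "traffic light"]
total, per_class = count_vehicles(detected)

print(rgb.tolist())
print(total, per_class)
```

With the hub model from the answer, the class names would come from the detections table, for example `results.pandas().xyxy[0]['name']` after running `results = model(bgr_to_rgb(cv2.imread(imgPath)))`. Keeping the counting step as a plain function over names makes it easy to check separately from the model.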