I trained a YOLOv5 model on a custom dataset with the training routine provided on GitHub (in tutorial.ipynb). Using this model to detect objects in unseen images gives decent results when executing:
!python detect.py --weights custom_weights.pt --img 224 --conf 0.5 --source data/images
Now I want to use the model in a small project. The following approach does not give good results: it either detects complete nonsense or nothing at all on the very same images used above.
import cv2
import torch

model = torch.hub.load('ultralytics/yolov5', 'custom', path='custom_weights.pt', force_reload=True)
model.eval()

img = cv2.imread('test.jpg')  # note: OpenCV loads images as BGR
pred = model(img)
bboxes = pred.xyxy
Am I forced to use detect.py, and hence to clone the whole YOLOv5 repository into my project? Or am I missing something in the way I call the model?
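For context, detect.py does more than a bare forward pass: it letterboxes the image to the requested size (a multiple of the model stride, 32), converts BGR to RGB, scales pixel values to [0, 1], and applies non-maximum suppression. As a minimal sketch, the stride rounding can be written like this (`round_to_stride` is a hypothetical name; YOLOv5 ships a similar check as `check_img_size` in utils/general.py):

```python
def round_to_stride(size: int, stride: int = 32) -> int:
    # Round an image dimension up to the nearest multiple of the model stride.
    return ((size + stride - 1) // stride) * stride

print(round_to_stride(224))  # 224 (already a multiple of 32)
print(round_to_stride(500))  # 512
```

This is why --img 224 works with detect.py: 224 is a valid input size, and the script resizes the source images accordingly before inference.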
It turns out that calling the model with a raw NumPy array does not work, at least not straightforwardly. If the image is converted to a torch.Tensor first, you can apply non-maximum suppression (from yolov5/utils/general.py) to the output and get a valid result:
import cv2
import torch
# assumes the yolov5 repository is on your path
from utils.general import non_max_suppression

model = torch.hub.load('ultralytics/yolov5', 'custom', path='custom_weights.pt', force_reload=True)
img = cv2.imread('image.jpg')
img = img[..., ::-1].copy()                  # BGR -> RGB (the model was trained on RGB images)
img = torch.from_numpy(img).float() / 255.0  # uint8 [0, 255] -> float [0, 1]
img = img.unsqueeze(0)                       # add batch dimension: (H, W, C) -> (1, H, W, C)
img = img.permute(0, 3, 1, 2)                # (1, H, W, C) -> (1, C, H, W); H and W must be multiples of the stride (32)
pred = model(img)
pred = non_max_suppression(pred)             # list with one (n, 6) tensor per image
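To make the result usable, note that non_max_suppression returns a list with one tensor per image, where each row follows YOLOv5's (n, 6) layout: x1, y1, x2, y2, confidence, class index. A minimal sketch of unpacking such rows (plain lists stand in for tensors here, so it runs without the repo; the coordinate values are made up):

```python
# Each detection row: [x1, y1, x2, y2, confidence, class index]
detections = [
    [34.0, 50.0, 120.0, 200.0, 0.87, 2.0],
    [10.0, 15.0, 60.0, 90.0, 0.55, 0.0],
]

for x1, y1, x2, y2, conf, cls in detections:
    print(f"class {int(cls)}: conf {conf:.2f}, box ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
```

With the real output, `pred[0]` holds the detections for the first (and here only) image, and the same unpacking applies; remember the coordinates refer to the tensor you passed in, not to any resized copy detect.py would have produced.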