I'm trying to recognize LEGO bricks from a video camera using OpenCV. It performs extremely badly compared with just running detect.py in YOLOv5. So I ran some experiments on plain images instead, and found that going through OpenCV still performs dramatically worse. Is there any clue why? Here are the experiments I did.
This is the result from detect.py, obtained by running:
python detect.py --weights runs/train/yolo/weights/best.pt --source legos.jpg
This is the result from OpenCV, using the following code:
import torch
import cv2
import numpy as np

model = torch.hub.load('.', 'custom', path='runs/train/yolo/weights/last.pt', source='local')

cap = cv2.VideoCapture('legos.jpg')
while cap.isOpened():
    ret, frame = cap.read()

    # Make detections
    results = model(frame)

    cv2.imshow('YOLO', np.squeeze(results.render()))
    if cv2.waitKey(0) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
However, if I simply pass the image path to the model, it gives a pretty good result:

import torch

model = torch.hub.load('.', 'custom', path='runs/train/yolo/weights/last.pt', source='local')  # same model as above
results = model('legos.jpg')
results.show()
Any genius ideas?
Your model was probably trained on RGB images, while OpenCV reads images in BGR order. Try converting the colour space accordingly. Example:
import torch
import cv2

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# read image and convert from BGR to RGB
img = cv2.imread('zidane.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# make detections
results = model(img)

# render() draws the boxes onto img in place; convert back to BGR for display
results.render()
out = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
cv2.imshow('YOLO', out)
cv2.waitKey(-1)
cv2.destroyAllWindows()
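The same conversion applies to your video-capture loop. Below is a minimal sketch of that loop with the fix applied; it assumes your locally trained weights at runs/train/yolo/weights/best.pt and a webcam at index 0, so adjust both to your setup:

import torch
import cv2

# assumes the locally trained weights from your question
model = torch.hub.load('.', 'custom', path='runs/train/yolo/weights/best.pt', source='local')

cap = cv2.VideoCapture(0)  # webcam index 0 (assumption); use your own video source
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # OpenCV delivers BGR frames; the model expects RGB
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = model(rgb)

    # render() returns the annotated RGB images; convert back to BGR for imshow
    annotated = cv2.cvtColor(results.render()[0], cv2.COLOR_RGB2BGR)
    cv2.imshow('YOLO', annotated)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Since results.render() hands back the annotated images, the frame already carries the boxes by the time it reaches cv2.imshow.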