python pytorch yolo

NotImplementedError: Could not run 'torchvision::nms' despite successful setup and device discovery


I'm trying to train a model on my laptop's GPU on Windows 10 using PyCharm, so I followed some online guides on how to properly set up my NVIDIA drivers, CUDA and cuDNN, and which library versions to install. The setup appeared to succeed: I can discover the GPU and query information about it. However, when I try to train a model, I get this error:

NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend.

I've set up torch, torchvision and torchaudio as directed on the PyTorch website:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

I have installed the latest NVIDIA driver, CUDA 12.1 and cuDNN, and made sure the paths in the environment variables are correct. I restarted my PC, but the error persists.

If I check my GPU info, it works as expected:

import torch

print(f'CUDA version: {torch.version.cuda}')
cuda_id = torch.cuda.current_device()
print(f"ID of current CUDA device:{torch.cuda.current_device()}")
print(f"Name of current CUDA device:{torch.cuda.get_device_name(cuda_id)}")
print(f'Torch version: {torch.__version__}')

Output:

CUDA version: 12.1
ID of current CUDA device:0
Name of current CUDA device:NVIDIA GeForce GTX 1060 with Max-Q Design
Torch version: 2.1.1+cu121

So the Torch version is correct and corresponds with my CUDA version.
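Note that the check above only covers torch, while the failing operator lives in torchvision. One commonly reported cause of this error is a torchvision build without CUDA kernels sitting next to a CUDA build of torch; the post does not confirm that this was the case here. As a minimal sketch, the build tags of the two packages can be compared. The helper names `build_tag` and `same_build` are made up for illustration, and the version strings in the comments are only examples.

```python
def build_tag(version):
    """Return the local build tag of a version string, e.g. 'cu121' or 'cpu'.

    Returns None when the version has no '+tag' suffix.
    """
    _, sep, tag = version.partition("+")
    return tag if sep else None


def same_build(torch_version, torchvision_version):
    """True only when both versions carry the same, non-empty build tag."""
    torch_tag = build_tag(torch_version)
    vision_tag = build_tag(torchvision_version)
    return torch_tag is not None and torch_tag == vision_tag


if __name__ == "__main__":
    try:
        import torch
        import torchvision
    except ImportError:
        print("torch or torchvision is not installed in this interpreter")
    else:
        # Example of a matching pair: '2.1.1+cu121' and '0.16.1+cu121'.
        print(f"torch:       {torch.__version__}")
        print(f"torchvision: {torchvision.__version__}")
        print(f"same build tag: {same_build(torch.__version__, torchvision.__version__)}")
```

A mismatch such as `+cu121` for torch and `+cpu` for torchvision would point to the installation rather than to PyCharm. A missing tag on either side is inconclusive, so the sketch treats it as "not the same".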

The code I use for training:

from datetime import datetime

from ultralytics import YOLO

model = YOLO('yolov8n.pt')

startTime = datetime.now()
results = model.train(data='data.yaml', epochs=15,
                      imgsz=[480, 352])
endTime = datetime.now()
delta = endTime - startTime
print('\n\n\n')
print(f'Model training took {delta}')
# total_seconds() covers the whole duration; .seconds drops any full days
print(f'Model training took {delta.total_seconds():.0f} seconds')

Could this still be an issue of improper installation? Might it be caused by PyCharm? I am at a loss.


Solution

  • The issue appears to have been caused by the pip installation. I gave up on getting it to work and installed Anaconda instead.

    I created a new environment, set up PyTorch as directed, installed all other dependencies, and it worked as expected.
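To confirm that a fresh environment actually fixes the problem before starting a long training run, the failing operator can be called directly. This is only a sketch: `check_nms` and its status strings are hypothetical names chosen for illustration, and the boxes are toy data. It uses `torchvision.ops.nms`, the Python entry point for the `torchvision::nms` operator named in the error.

```python
def check_nms(device="cuda"):
    """Return a short status string for running torchvision's NMS on `device`."""
    try:
        import torch
        from torchvision.ops import nms
    except ImportError:
        return "missing"  # torch or torchvision is not installed here

    if device.startswith("cuda") and not torch.cuda.is_available():
        return "no-cuda"  # torch itself cannot see a GPU

    # Three toy boxes in (x1, y1, x2, y2) format. The first two overlap
    # with IoU of about 0.68, the third is far away from both.
    boxes = torch.tensor(
        [[0.0, 0.0, 10.0, 10.0],
         [1.0, 1.0, 11.0, 11.0],
         [50.0, 50.0, 60.0, 60.0]],
        device=device,
    )
    scores = torch.tensor([0.9, 0.8, 0.7], device=device)

    try:
        keep = nms(boxes, scores, iou_threshold=0.5)
    except (NotImplementedError, RuntimeError):
        # The error from the question would end up here.
        return "unsupported"

    # With a 0.5 threshold the second box is suppressed by the first.
    return "ok" if keep.tolist() == [0, 2] else "unexpected"


if __name__ == "__main__":
    for dev in ("cpu", "cuda"):
        print(f"{dev}: {check_nms(dev)}")
```

If this prints `ok` for `cpu` but `unsupported` for `cuda`, the installed torchvision most likely lacks CUDA kernels, and reinstalling torch and torchvision together in a clean environment, as described above, is a reasonable next step.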