Cannot convince PyTorch to install with CUDA (Windows 11)

I am trying to install PyTorch with CUDA using Anaconda3 on Windows 11:

My GPU is RTX 3060.
My conda environment is Python 3.10.13.
nvidia-smi outputs Driver Version: 551.23, CUDA Version: 12.4.

What I tried:

Following the instructions on https://pytorch.org/get-started/locally/, picking CUDA 12.1 (and 11.8)
Reading through a bunch of posts on https://discuss.pytorch.org/ (including https://discuss.pytorch.org/t/torch-cuda-is-available-gives-false/197333)
Reading through a bunch of posts on SO (e.g. PyTorch: CUDA is not available)

Many articles say that I don't need an independently installed CUDA, so I uninstalled the system-wide version, but predictably it made no difference. No matter what I try, torch.cuda.is_available() always returns False.

My tentative conclusion is that the NVIDIA driver must match the version of CUDA, but that seems counterintuitive to me. I've always made a habit of installing the latest and greatest NVIDIA driver (I use my machine for gaming as much as I want to use it for ML). Is my understanding correct, and is the only way to get PyTorch to work with CUDA to downgrade my GPU driver from 551.23 to something that supports CUDA 12.1?

Solution

Find and uninstall non-CUDA torch installations!

In my case, the issue was caused by another, system-wide installation of torch. Before I started using Anaconda, I was under the silly impression that I could get away with using vanilla Python. What's more, I was using a mixture of system-wide and user-installed packages - a nice recipe for disaster.

@talonmies offered a great tip: printing out the output of torch.cuda.get_arch_list(). In my case, it was an empty list [], indicating a wrong torch library was loaded.
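For anyone who wants to repeat that check, here is a minimal sketch of the diagnostic I would run inside the environment that is supposed to have CUDA. The helper name describe_torch is my own and not part of PyTorch; the torch attributes it reads (torch.__file__, torch.__version__, torch.version.cuda, torch.cuda.get_arch_list(), torch.cuda.is_available()) are the standard ones. The most telling line is usually "file": if it points outside the Anaconda environment, a different torch is shadowing the one you installed.

```python
import importlib


def describe_torch():
    """Report which torch build the current interpreter actually imports."""
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError) as exc:
        # Not installed, or a broken install whose native libraries fail to load.
        return {"installed": False, "error": repr(exc)}
    return {
        "installed": True,
        "file": torch.__file__,                   # where the package was loaded from
        "version": torch.__version__,             # CPU-only wheels often end in "+cpu"
        "built_with_cuda": torch.version.cuda,    # None for CPU-only builds
        "arch_list": torch.cuda.get_arch_list(),  # [] when CUDA is not usable
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch().items():
        print(f"{key}: {value}")
```

Running the same script with the system Python and with the Anaconda environment's Python and comparing the "file" and "built_with_cuda" lines should show quickly whether two different builds are in play.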

I ran pip list using the regular Python outside of the Anaconda environment and, lo and behold, I had the non-CUDA torch installed! I used pip to uninstall everything that had to do with torch/pytorch (and noticed some weird ~orch leftovers that I forcefully purged) - and then all was good.
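The hunt for stray copies can also be scripted. The sketch below is only an illustration of what I did by hand: the function names are hypothetical, and the assumption that leftovers show up as directories starting with "~orch" is based on what I saw on my machine. It only lists candidates and deletes nothing.

```python
import os
import site
import sys


def find_torch_leftovers(directories):
    """Return paths of entries that look like torch installs or pip leftovers."""
    hits = []
    for directory in directories:
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            lowered = name.lower()
            # "torch..." covers torch, torchvision, torch-*.dist-info, and so on.
            # "~orch..." is how a partially removed package showed up for me.
            if lowered.startswith(("torch", "~orch")):
                hits.append(os.path.join(directory, name))
    return hits


def candidate_directories():
    """Site-packages directories visible to the interpreter running this script."""
    dirs = [p for p in sys.path if p.endswith("site-packages")]
    dirs.append(site.getusersitepackages())  # per-user installs (pip install --user)
    return sorted(set(dirs))


if __name__ == "__main__":
    for path in find_torch_leftovers(candidate_directories()):
        print(path)
```

Run it once with the system-wide Python and once with the environment's Python, since each interpreter sees its own site-packages. Anything reported outside the environment is a candidate for pip uninstall (or manual removal, in the case of the ~orch debris).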

Lesson learned: when doing complex things with Python (e.g. ML), always start with a virtual environment (Anaconda, Miniconda, virtualenv, what have you) or you will regret it later.
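A quick sanity check before installing anything is to confirm which kind of interpreter you are actually running. This is a rough sketch under two assumptions: that an activated conda environment sets the CONDA_DEFAULT_ENV variable, and that a venv/virtualenv interpreter has sys.prefix different from sys.base_prefix. The function name isolation_status is made up for this example.

```python
import os
import sys


def isolation_status(prefix, base_prefix, environ):
    """Classify the interpreter as conda env, virtual env, or plain system install."""
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        # Note: conda's "base" environment is reported here as well.
        return f"conda environment: {conda_env}"
    if prefix != base_prefix:
        return f"virtual environment: {prefix}"
    return "system-wide Python (no environment active)"


if __name__ == "__main__":
    print(isolation_status(sys.prefix, sys.base_prefix, os.environ))
    print("interpreter:", sys.executable)
```

If this prints "system-wide Python" right before a pip install torch, that is exactly the situation that bit me.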