pytorch cuda nvcc nvidia-smi

Getting CUDA version correctly reported by nvcc


I am trying to update CUDA in Ubuntu, following the guide here. In my initial setup the CUDA version was reported as:

  • via nvcc - Cuda compilation tools, release 10.1, V10.1.243
  • via nvidia-smi - 11.1

After the error-free update, the CUDA versions were reported as:

  • via nvcc - Cuda compilation tools, release 10.1, V10.1.243
  • via nvidia-smi - 12.1

The difficulty this causes is that when I try to install, say, torch-cluster, I get an error:

RuntimeError:
      The detected CUDA version (10.1) mismatches the version that was used to compile
      PyTorch (11.7). Please make sure to use the same CUDA versions.

Based on recommendations here, I explicitly downloaded Toolkit v11.7 from here and tried to install it, but got this message:

$ sudo apt-get -y install cuda
Reading package lists... Done
Building dependency tree       
Reading state information... Done
cuda is already the newest version (12.1.1-1).

But nvcc continues to report:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

How do I resolve this? (this may be a similar question that isn't answered yet)


Solution

  • This ended up being partly a $PATH issue (needed for nvcc -V to report correctly) and partly a library issue.

    First, the $PATH. Your $PATH variable needs to include the location of the CUDA Toolkit version you want to use. nvidia-smi indicates that the correct driver is in place (the version it shows is the highest CUDA version the driver supports, not the toolkit that nvcc belongs to), so you just need to make sure that the toolkit is present, i.e. that /usr/local/cuda-xx.x exists. Two possibilities:

    • if it exists, then just modify your $PATH. The easiest way is to look at the current path with echo $PATH, copy the entire output string to a text editor, change it so that it includes /usr/local/cuda-xx.x/bin (deleting the entry for any old version if one exists), and then set it with export PATH=<your new string>. Yes, you can just add to the path, but doing it this way lets you see exactly what your path is and what has changed rather than blindly adding to an existing path. It also matters because the shell runs the first nvcc it finds on the path, so an entry appended after the old one would have no effect.
    • if the toolkit of the version you want is not present then install it (see next item)
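
    The manual $PATH edit described above can also be scripted. The following is only a sketch, assuming bash and toolkits installed under /usr/local/cuda*; the function name cuda_path is hypothetical and not part of any CUDA tooling. It prints a new path with the requested toolkit's bin directory first and any other /usr/local/cuda*/bin entries removed:

```shell
# Hypothetical helper (not part of CUDA): build a PATH string that puts the
# requested toolkit's bin directory first and drops other CUDA bin entries.
#   $1 = bin directory of the toolkit you want, e.g. /usr/local/cuda-11.7/bin
#   $2 = the existing PATH string
cuda_path() {
    local cuda_bin="$1" old_path="$2"
    local new_path="$cuda_bin"
    local entry
    local IFS=:                      # split the old PATH on colons
    for entry in $old_path; do
        case "$entry" in
            /usr/local/cuda*/bin|/usr/local/cuda*/bin/) ;;  # drop old toolkit entries
            "") ;;                                          # drop empty entries
            *) new_path="$new_path:$entry" ;;               # keep everything else
        esac
    done
    printf '%s\n' "$new_path"
}

# Usage (inspect the result before exporting it):
#   cuda_path /usr/local/cuda-11.7/bin "$PATH"
#   export PATH="$(cuda_path /usr/local/cuda-11.7/bin "$PATH")"
#   nvcc -V
```

    Printing the string before exporting it keeps the benefit of the manual approach: you can see exactly what changed. Note that export only affects the current shell session; adding the export line to a startup file such as ~/.bashrc is one common way to make it persistent.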

    In this case, the toolkit I was missing for library installations was 11.7. After the driver installation I had /usr/local/cuda-10.1 (old), which nvcc was reporting, and /usr/local/cuda-12.1 from the install, but I needed 11.7. It turns out that multiple toolkit versions can coexist. The solution was:

    • install 11.7 with sudo apt-get install cuda-toolkit-11-7
    • make sure it is discoverable with export CUDA_HOME=/usr/local/cuda-11.7/
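
    To confirm that the toolkit under CUDA_HOME is the release you expect, something like the following may help. It is a sketch assuming bash and the "release X.Y" wording shown in the nvcc output in the question; nvcc_release and check_release are hypothetical helper names, not NVIDIA commands:

```shell
# Hypothetical helpers (not part of CUDA).

# Read `nvcc --version` output on stdin and print the release number, e.g. 10.1.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Read `nvcc --version` output on stdin and compare its release with $1.
# Returns 0 on a match, 1 otherwise.
check_release() {
    local expected="$1" actual
    actual="$(nvcc_release)"
    if [ "$actual" = "$expected" ]; then
        echo "nvcc reports release $actual, as expected"
    else
        echo "nvcc reports release '$actual', expected $expected" >&2
        return 1
    fi
}

# Usage (paths taken from the steps above):
#   export CUDA_HOME=/usr/local/cuda-11.7/
#   "${CUDA_HOME%/}/bin/nvcc" --version | check_release 11.7
```

    Calling nvcc through CUDA_HOME rather than relying on $PATH checks the specific toolkit directory, so the result does not depend on which nvcc happens to come first on the path.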