gcc pytorch cuda nvcc

Specify GCC version for nvcc without root privileges


I am using a GPU cluster where submitted jobs are managed by Slurm. I don't have admin / root privileges on that server. I am currently trying to build a project that contains .cpp and .cu files. I do that by calling TORCH_CUDA_ARCH_LIST=7.0 CC=gcc-7 CXX=g++-7 python setup.py install, as the cluster uses CUDA 10.1 and runs V100 GPUs (compute capability 7.0, hence the gencode is sm_70).

However, the build crashes with the following error message:

building <filename> extension
gcc-7 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes (...): 
error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
  138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
      |  ^~~~~
error: command '/<somepath>/anaconda3/envs/pytorch14/bin/nvcc' failed with exit status 1

So, as the gcc-7 call in the second line shows, the Python script uses the right compiler for the host code. Unfortunately, the nvcc call uses the system-wide gcc symlink, which is: /usr/bin/gcc: symbolic link to gcc-9. I have found a couple of answers online (including this and this) and have tried the suggested steps. However, as I don't have root access, I cannot create a new symlink or point the existing one at another installed gcc version such as /usr/bin/gcc-7: running ln -s /usr/bin/gcc-7 /usr/bin/gcc gives me a ln: failed to create symbolic link '/usr/bin/gcc': File exists error, and copying the files into /usr/local/bin, as suggested in other answers on SO, won't work either because of the missing privileges.
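To confirm which binary a bare compiler name actually resolves to, no root access is needed. The following is a minimal diagnostic sketch; the helper name resolve_compiler is made up for illustration, and it only inspects PATH and follows symlinks, it does not change anything on the system.

```python
import os
import shutil


def resolve_compiler(name):
    """Return the real file behind a compiler name found on PATH, or None."""
    found = shutil.which(name)  # first match on PATH, like the shell would pick
    if found is None:
        return None
    # Follow symlinks such as /usr/bin/gcc -> gcc-9 to the final target.
    return os.path.realpath(found)


if __name__ == "__main__":
    for name in ("gcc", "gcc-7", "g++-7"):
        print(name, "->", resolve_compiler(name))
```

If gcc resolves to a gcc-9 binary while gcc-7 resolves to something else, that matches the situation described above: the host compiler chosen via CC/CXX is fine, but nvcc falls back to the default gcc.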

I'm really at a loss here and feel that this might be a dead end. Does anybody have any suggestions?

For reference, this is what my setup.py looks like:

from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name='noise_cuda',
    ext_modules=[
        CUDAExtension('noise_cuda', [
            'noise_cuda.cpp',
            'noise_cuda_kernel.cu',
        ]),
    ],
    cmdclass={
        'build_ext': BuildExtension
    })

Solution

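  • I'm not a PyTorch user, but if I read the docs right, this should work. It passes nvcc's -ccbin option through extra_compile_args, so nvcc uses gcc-7 as its host compiler instead of the default gcc on the system, and no symlink has to be changed: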

    
    import sysconfig
    from setuptools import setup
    from torch.utils.cpp_extension import BuildExtension, CUDAExtension
    
    setup(
        name='noise_cuda',
        ext_modules=[
            CUDAExtension('noise_cuda', [
                'noise_cuda.cpp',
                'noise_cuda_kernel.cu',
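            ], extra_compile_args={
                # Host C++ flags: reuse the CFLAGS this Python was built with.
                'cxx': sysconfig.get_config_var('CFLAGS').split(),
                # Point nvcc at gcc-7 as host compiler instead of /usr/bin/gcc.
                'nvcc': ['-ccbin=/usr/bin/gcc-7']}),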
        ],
        cmdclass={
            'build_ext': BuildExtension
        })
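The setup.py above hard-codes /usr/bin/gcc-7. If the compiler path differs between machines, the flags could instead be derived from the CC environment variable that is already set on the command line. This is only a sketch under assumptions: build_compile_args is a hypothetical helper, not part of PyTorch, and the fallback name gcc-7 is taken from the question.

```python
import os
import shutil
import sysconfig


def build_compile_args(host_cc=None):
    """Build an extra_compile_args dict that points nvcc at a host compiler.

    host_cc: compiler name or path; defaults to $CC, then 'gcc-7' (assumption).
    """
    host_cc = host_cc or os.environ.get("CC", "gcc-7")
    # Use the absolute path if the compiler is on PATH; otherwise pass it through.
    resolved = shutil.which(host_cc) or host_cc
    # CFLAGS can be unset on some platforms, so fall back to an empty string.
    cflags = sysconfig.get_config_var("CFLAGS") or ""
    return {
        "cxx": cflags.split(),
        "nvcc": ["-ccbin=" + resolved],
    }
```

In setup.py this would be used as CUDAExtension('noise_cuda', [...], extra_compile_args=build_compile_args()), so that CC=gcc-7 python setup.py install selects the same compiler for both the host build and nvcc.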