Tags: python, pytorch, cuda, huggingface-transformers, llama

RuntimeError: CUDA error: no kernel image is available for execution on the device for cuda 11.8 and torch 2.0.0


I wanted to use meta-llama/Llama-2-13b-chat-hf, but I am having this error:

RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.

For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

The output of nvidia-smi is:

| NVIDIA-SMI 465.19.01    Driver Version: 465.19.01    CUDA Version: 11.3     |

NVCC:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation

Built on Wed_Sep_21_10:33:58_PDT_2022

Cuda compilation tools, release 11.8, V11.8.89

Build cuda_11.8.r11.8/compiler.31833905_0

Torch Version:

torch==2.0.0+cu118
torchaudio==2.0.1+cu118
torchvision==0.15.1+cu118

Transformers Version:

transformers==4.37.2

I have several RTX 2080 Ti cards and one GT 710, and am running Ubuntu 16.

I also got this in my output:

Found GPU9 NVIDIA GeForce GT 710 which is of cuda capability 3.5.
    PyTorch no longer supports this GPU because it is too old.
    The minimum cuda capability supported by this library is 3.7.

I was downloading torch versions from here.

I tried building torch from source as well, but it gave me the same error.

The output from bitsandbytes:


++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
COMPILED_WITH_CUDA = True
COMPUTE_CAPABILITIES_PER_GPU = ['7.5', '7.5', '7.5', '7.5', '7.5', '7.5', '7.5', '7.5', '7.5', '3.5']
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Running a quick check that:
    + library is importable
    + CUDA function is callable


WARNING: Please be sure to sanitize sensible info from any such env vars!

SUCCESS!
Installation was successful!

I ran this code to test torch:

import torch
import sys
print('A', sys.version)
print('B', torch.__version__)
print('C', torch.cuda.is_available())
print('D', torch.backends.cudnn.enabled)
device = torch.device('cuda')
print('E', torch.cuda.get_device_properties(device))
print('F', torch.tensor([1.0, 2.0]).cuda())

It gave me this output:

A 3.11.7 (main, Dec 15 2023, 18:12:31) [GCC 11.2.0]
B 2.0.0+cu118
C True
D True
    UserWarning: 
    Found GPU9 NVIDIA GeForce GT 710 which is of cuda capability 3.5.
    PyTorch no longer supports this GPU because it is too old.
    The minimum cuda capability supported by this library is 3.7.
    
  warnings.warn(old_gpu_warn % (d, name, major, minor, min_arch // 10, min_arch % 10))
E _CudaDeviceProperties(name='NVIDIA GeForce RTX 2080 Ti', major=7, minor=5, total_memory=11019MB, multi_processor_count=68)
F tensor([1., 2.], device='cuda:0')

What should I do to fix this?


Solution

  • As mentioned in the warning:

        UserWarning: 
        Found GPU9 NVIDIA GeForce GT 710 which is of cuda capability 3.5.
        PyTorch no longer supports this GPU because it is too old.
        The minimum cuda capability supported by this library is 3.7.
    
    

    The main issue was that torch was trying to run kernels on the GT 710, and because that GPU (compute capability 3.5) is no longer supported, the program crashed. To fix this I set an environment variable so that only the 2080 Ti cards (indices 0-8) are visible:

    export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7,8
    

    Torch now ignores the GT 710, and everything runs fine for me.
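
    The same thing can be done from inside a script instead of the shell. A minimal sketch (the indices `0-8` assume the same device layout as above, where the GT 710 is `GPU9` as reported in the warning; adjust them for your machine). The key point is that `CUDA_VISIBLE_DEVICES` is read when CUDA initializes, so it must be set before the first `import torch`:

    ```python
    import os

    # CUDA reads CUDA_VISIBLE_DEVICES once, at initialization, so set it
    # before the first `import torch` (or export it in the shell instead).
    # Indices 0-8 keep the 2080 Ti cards and drop index 9 (the GT 710).
    os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3,4,5,6,7,8"

    try:
        import torch  # imported only after the variable is set

        # Every visible device should now report compute capability >= 3.7.
        for i in range(torch.cuda.device_count()):
            major, minor = torch.cuda.get_device_capability(i)
            print(f"cuda:{i}", torch.cuda.get_device_name(i), f"cc {major}.{minor}")
    except ImportError:
        # torch not installed in this environment; the variable still takes
        # effect for any CUDA application launched from this process.
        pass
    ```

    Note that the indices in `CUDA_VISIBLE_DEVICES` follow CUDA's enumeration order, which can differ from `nvidia-smi`'s; setting `CUDA_DEVICE_ORDER=PCI_BUS_ID` makes the two agree if you need to match them up.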