OS: Ubuntu 18.04 LTS
CUDA: 11.3
GPU: NVIDIA Quadro P5000
IDE: Jupyter Notebook
Environment: VirtualEnv (venv)
Code:
# Importing the required libraries
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
# Defining the name of the Falcon model
model_name = "ybelkada/falcon-7b-sharded-bf16"
# Configuring the BitsAndBytes quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
# Loading the Falcon model with quantization configuration
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    trust_remote_code=True
)
# Disabling cache usage in the model configuration
model.config.use_cache = False
Error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File ~/FYP_Chatbot/test02/src/myenv-test002-02/lib/python3.9/site-packages/torch/cuda/__init__.py:242, in _lazy_init()
241 try:
--> 242 queued_call()
243 except Exception as e:
File ~/FYP_Chatbot/test02/src/myenv-test002-02/lib/python3.9/site-packages/torch/cuda/__init__.py:122, in _check_capability()
116 old_gpu_warn = """
117 Found GPU%d %s which is of cuda capability %d.%d.
118 PyTorch no longer supports this GPU because it is too old.
119 The minimum cuda capability supported by this library is %d.%d.
120 """
--> 122 if torch.version.cuda is not None: # on ROCm we don't want this check
123 CUDA_VERSION = torch._C._cuda_getCompiledVersion()
AttributeError: module 'torch' has no attribute 'version'
The above exception was the direct cause of the following exception:
DeferredCudaCallError Traceback (most recent call last)
Cell In[10], line 17
10 bnb_config = BitsAndBytesConfig(
11 load_in_4bit=True,
12 bnb_4bit_quant_type="nf4",
13 bnb_4bit_compute_dtype=torch.float16,
14 )
16 # Loading the Falcon model with quantization configuration
---> 17 model = AutoModelForCausalLM.from_pretrained(
18 model_name,
19 quantization_config=bnb_config,
20 trust_remote_code=True
21 )
23 # Disabling cache usage in the model configuration
24 model.config.use_cache = False
File ~/FYP_Chatbot/test02/src/myenv-test002-02/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py:563, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
561 elif type(config) in cls._model_mapping.keys():
562 model_class = _get_model_class(config, cls._model_mapping)
--> 563 return model_class.from_pretrained(
564 pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
565 )
566 raise ValueError(
567 f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
568 f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
569 )
File ~/FYP_Chatbot/test02/src/myenv-test002-02/lib/python3.9/site-packages/transformers/modeling_utils.py:3053, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
3049 hf_quantizer.validate_environment(
3050 torch_dtype=torch_dtype, from_tf=from_tf, from_flax=from_flax, device_map=device_map
3051 )
3052 torch_dtype = hf_quantizer.update_torch_dtype(torch_dtype)
-> 3053 device_map = hf_quantizer.update_device_map(device_map)
3055 # Force-set to `True` for more mem efficiency
3056 if low_cpu_mem_usage is None:
File ~/FYP_Chatbot/test02/src/myenv-test002-02/lib/python3.9/site-packages/transformers/quantizers/quantizer_bnb_4bit.py:246, in Bnb4BitHfQuantizer.update_device_map(self, device_map)
244 def update_device_map(self, device_map):
245 if device_map is None:
--> 246 device_map = {"": torch.cuda.current_device()}
247 logger.info(
248 "The device_map was not initialized. "
249 "Setting device_map to {'':torch.cuda.current_device()}. "
250 "If you want to use the model for inference, please set device_map ='auto' "
251 )
252 return device_map
File ~/FYP_Chatbot/test02/src/myenv-test002-02/lib/python3.9/site-packages/torch/cuda/__init__.py:552, in current_device()
550 def current_device() -> int:
551 r"""Returns the index of a currently selected device."""
--> 552 _lazy_init()
553 return torch._C._cuda_getDevice()
File ~/FYP_Chatbot/test02/src/myenv-test002-02/lib/python3.9/site-packages/torch/cuda/__init__.py:246, in _lazy_init()
243 except Exception as e:
244 msg = (f"CUDA call failed lazily at initialization with error: {str(e)}\n\n"
245 f"CUDA call was originally invoked at:\n\n{orig_traceback}")
--> 246 raise DeferredCudaCallError(msg) from e
247 finally:
248 delattr(_tls, 'is_initializing')
DeferredCudaCallError: CUDA call failed lazily at initialization with error: module 'torch' has no attribute 'version'
Environment Packages:
accelerate==0.29.1
bitsandbytes==0.43.0
datasets==2.18.0
einops==0.7.0
fsspec==2023.10.0
peft @ git+https://github.com/huggingface/peft.git@26726bf1ddee6ca75ed4e1bfd292094526707a78
torch==1.13.0
transformers==4.39.3
trl==0.8.1
wandb==0.16.6
I encountered the error after downgrading PyTorch from 2.2.2 to 1.13.0. I had to downgrade because my CUDA toolkit is version 11.3, which is not compatible with later versions of PyTorch. I chose 1.13.0 specifically because the Hugging Face transformers library I am using requires PyTorch >= 1.13.0.
Nvidia Graphics Cards Details (nvidia-smi):
Sat Apr 6 22:40:45 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.182.03 Driver Version: 470.182.03 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Quadro P5000 Off | 00000000:01:00.0 On | Off |
| 27% 44C P8 6W / 180W | 295MiB / 16275MiB | 3% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 938 G /usr/lib/xorg/Xorg 103MiB |
| 0 N/A N/A 1150 G /usr/bin/gnome-shell 37MiB |
| 0 N/A N/A 1986 G /usr/lib/firefox/firefox 150MiB |
+-----------------------------------------------------------------------------+
nvcc -V:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Mar_21_19:15:46_PDT_2021
Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0
According to the information given in your post, you have two requirements:
1. The transformers package you are trying to use requires PyTorch >= 1.13.0.
2. The maximum CUDA version supported by your GPU driver is 11.4 (the "CUDA Version" field in your nvidia-smi output), so the CUDA toolkit bundled with PyTorch must be <= 11.4.
These two requirements conflict: support for CUDA versions in that range was dropped with the release of PyTorch 1.13 (see the release notes), which ships builds for CUDA 11.6 and 11.7. A viable solution would be to check whether you can update your NVIDIA driver to one that supports CUDA 11.6 or 11.7. With that driver, you will be able to install PyTorch 1.13 with the appropriate CUDA toolkit.
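The compatibility rule above can be sketched as a small check. This is only an illustration using the standard library: the helper names `driver_cuda_version` and `driver_supports` are hypothetical, and the comparison follows the simple "bundled toolkit must be <= driver CUDA version" rule from this answer. It ignores CUDA's minor-version compatibility, which may relax the rule in practice.

```python
import re
from typing import Optional, Tuple


def driver_cuda_version(smi_output: str) -> Optional[Tuple[int, int]]:
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_supports(smi_output: str, wheel_cuda: str) -> bool:
    """Return True if the driver's maximum CUDA version covers wheel_cuda."""
    driver = driver_cuda_version(smi_output)
    if driver is None:
        # No CUDA version reported, so assume no support.
        return False
    major, minor = (int(part) for part in wheel_cuda.split("."))
    return driver >= (major, minor)


# Header line copied from the nvidia-smi output in the question.
smi_header = "| NVIDIA-SMI 470.182.03 Driver Version: 470.182.03 CUDA Version: 11.4 |"

# 11.3 is the installed toolkit; 11.6 and 11.7 are the PyTorch 1.13 builds.
for wheel_cuda in ("11.3", "11.6", "11.7"):
    print(wheel_cuda, driver_supports(smi_header, wheel_cuda))
```

To run the check against the live driver instead of the pasted header, the text could be obtained with `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`.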