python, huggingface-transformers

How to enable CUDA for Huggingface Trainer on Windows?


I am trying to use the Trainer from Hugging Face's transformers library in Python:

from transformers import Seq2SeqTrainingArguments
from transformers import Seq2SeqTrainer

# ...

training_args = Seq2SeqTrainingArguments(
    fp16=True,
    # ...
)

trainer = Seq2SeqTrainer(
    args=training_args,
    # ...
)

I get this error message:

ValueError: FP16 Mixed precision training with AMP or APEX (--fp16) and FP16 half precision evaluation (--fp16_full_eval) can only be used on CUDA or NPU devices or certain XPU devices (with IPEX).

It seems like I am missing some CUDA installation, but I can't figure out what exactly I need. I tried the following, without success:

py -m pip install --upgrade setuptools pip wheel
py -m pip install nvidia-pyindex
py -m pip install nvidia-cuda-runtime-cu12

py -m pip install nvidia-nvml-dev-cu12
py -m pip install nvidia-cuda-nvcc-cu12
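
One way to check whether the PyTorch build inside the venv was compiled with CUDA at all (a quick diagnostic sketch, assuming torch is already installed):

import torch

print(torch.__version__)
print(torch.version.cuda)         # CUDA version the wheel was built against, or None for a CPU-only build
print(torch.cuda.is_available())  # False without a CUDA-enabled build and a working driver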

System info:

  • Windows 11 Build 22621
  • Python 3.11.7, running inside a venv
  • GeForce RTX 4070

Thanks for any ideas!


Solution

  • Found it. Adding this line to the code

    model.to('cuda')
    

    gave a more meaningful error message:

    Torch not compiled with CUDA enabled

    Generated the correct pip install command at https://pytorch.org/get-started/locally/ (something like pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121) and it worked.
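
    To verify that the CUDA-enabled build is actually picked up after reinstalling (a quick sanity check, assuming the cu121 wheel was installed into the same venv):

    import torch

    print(torch.version.cuda)             # e.g. "12.1" for a cu121 wheel
    print(torch.cuda.is_available())      # should now be True
    print(torch.cuda.get_device_name(0))  # the GPU PyTorch will use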