
HuggingFace Accelerate won't use the proper number of processes


The server I'm using has a total of 4 GPUs and I want to use 2 of them. I'm therefore running accelerate launch --num_processes 2 train.py, but when the script starts it reports that the number of processes is only 1. Why is this happening? It seems that, as of now, I can only use either all 4 GPUs or just 1.


Solution

  • When using the command line interface, you also need to pass the --multi_gpu argument to enable distributed training (see the example below). However, the documentation recommends using accelerate config to set up the configuration instead of passing flags manually.
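
As a minimal sketch, assuming the training script is called train.py as in the question, the launch could look like the following. The choice of GPUs 0 and 1 via CUDA_VISIBLE_DEVICES is only an illustrative assumption, not something stated in the original question:

    # Enable distributed training explicitly and spawn 2 processes
    accelerate launch --multi_gpu --num_processes 2 train.py

    # If you need to choose which 2 of the 4 GPUs are used, one common approach
    # is to restrict device visibility (hypothetical choice of GPUs 0 and 1)
    CUDA_VISIBLE_DEVICES=0,1 accelerate launch --multi_gpu --num_processes 2 train.py

    # Alternatively, answer the interactive prompts once (number of GPUs,
    # mixed precision, etc.); the answers are saved to a default config file
    # and later launches pick them up without extra flags
    accelerate config
    accelerate launch train.py

Running accelerate config once is usually the simpler option, since it avoids having to remember the right combination of flags for every launch.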