
How do I configure Slurm to use shards?


How do I use the shard configuration in Slurm?

slurm.conf:

    GresTypes=gpu,shard
    NodeName=ubuntu-deeplearning-2602011 NodeAddr=10.26.2.11 Gres=gpu:10,shard:1000 CPUs=48 Boards=1 SocketsPerBoard=2 CoresPerSocket=12 ThreadsPerCore=2 RealMemory=257589 State=UNKNOWN

gres.conf:

    #AutoDetect=nvml
    NodeName=ubuntu-deeplearning-2602011 Name=gpu File=/dev/nvidia[0-9]
    Name=shard Count=1000 File=/dev/nvidia[0-9]

test.sh:

    ...
    #SBATCH --gres=gpu:1   # How do I request a shard here instead?
    ...
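Once the Slurm daemons have been restarted with this configuration, a quick sanity check (a sketch only; the node name is the one from the question, and the exact output format depends on the Slurm version) is to confirm that both gpu and shard show up as GRES on the node:

    # show the GRES registered for this node
    scontrol show node ubuntu-deeplearning-2602011 | grep -i gres

    # or list the configured GRES for all nodes
    sinfo -o "%N %G"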


Solution

  • You can use shards instead of whole GPUs like this:

    test.sh: ... #SBATCH --gres=shard:100
    

    In my understanding, the two request types are not meant to be combined in a single job: if you want a fraction of a single GPU, request shards; if you want one or more whole GPUs, request gpu. A fuller batch-script example is sketched after this answer.

    This answer is based on https://groups.google.com/g/slurm-users/c/C1CLyPpD1e0
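
    Below is a minimal sketch of a batch script that requests shards instead of a whole GPU. Only the --gres=shard:100 line comes from this answer; the job name, CPU/memory values, and train.py are illustrative placeholders.

        #!/bin/bash
        #SBATCH --job-name=shard-test      # placeholder job name
        #SBATCH --gres=shard:100           # request 100 shard units (a fraction of one GPU on this node)
        #SBATCH --cpus-per-task=4          # illustrative value
        #SBATCH --mem=16G                  # illustrative value

        # With a typical cgroup/GRES setup the job is confined to the GPU backing
        # the allocated shards, so CUDA programs can be launched as usual.
        nvidia-smi
        python train.py                    # placeholder workload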