linux centos hpc slurm

Change CPU count for RUNNING Slurm Jobs


I have a SLURM cluster and a RUNNING job for which I requested 60 threads with

#SBATCH --cpus-per-task=60

(I am sharing threads on a node using cgroups)

I now want to reduce the number of threads to 30.

$ scontrol update jobid=274332 NumCPUs=30
Job is no longer pending execution for job 274332

The job still has 60 threads allocated.

$ scontrol show job 274332
JobState=RUNNING Reason=None Dependency=(null)
NumNodes=1 NumCPUs=60 NumTasks=1 CPUs/Task=60 ReqB:S:C:T=0:0:*:*

What would be the correct way to accomplish this?

Thanks!


Solution

  • In the current version of Slurm, scontrol only allows reducing the number of nodes allocated to a running job, not the number of CPUs (or the amount of memory).

    The FAQ says:

    Use the scontrol command to change a job's size either by specifying a new node count (NumNodes=) for the job or identify the specific nodes (NodeList=) that you want the job to retain.

    (Emphasis mine)
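As a rough illustration of the form the FAQ describes, the sketch below only builds and prints the scontrol command for shrinking a running job by node count; it does not contact a cluster. The helper name shrink_cmd, the job ID 12345, and the node counts are hypothetical and chosen purely for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: print (do not execute) the scontrol command that shrinks a
# running job to a smaller node count. The helper name and the job ID used
# below are hypothetical, not taken from the question.
shrink_cmd() {
  local jobid="$1"   # ID of the running job
  local nodes="$2"   # new, smaller node count
  printf 'scontrol update JobId=%s NumNodes=%s\n' "$jobid" "$nodes"
}

# Example: a hypothetical multi-node job 12345 reduced to 2 nodes.
shrink_cmd 12345 2
```

For the job in the question this does not help directly: it already runs on a single node, so there are no nodes to give up. Under that constraint, the likely fallback is to cancel the job and resubmit it with --cpus-per-task=30.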