
How do I get the list of nodes allocated to the current job in SLURM?


I have software that requires a plain-text list of the nodes where tasks are being sent, with one entry per task. For example, if my job was launched with -n 4 -c 1, and I get 3 CPUs on node1 and 1 CPU on node2, I'd like to get a file such as:

node1
node1
node1
node2

How can I get such a list?

I tried using:

scontrol show hostnames $SLURM_JOB_NODELIST

But this only works if ALL the tasks are assigned to separate nodes. In the example above, it would just result in:

node1
node2

So the software would send only one task to each node and underutilize the CPUs allocated on node1.

Thanks! Miguel.


Solution

  • The easiest (though perhaps not most canonical) way is probably to run

    srun hostname > hostfile
    

    Run from within the job allocation, this launches one hostname process per task, so the file hostfile will contain each hostname once for every task allocated to that host.
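
    If launching a job step just to collect hostnames is undesirable, a rough alternative is to combine the scontrol command from the question with the SLURM_TASKS_PER_NODE environment variable, which lists per-node task counts in the same order as SLURM_JOB_NODELIST (for example "3,1", or "2(x3),1" for three consecutive nodes with two tasks each). The sketch below assumes bash and assumes both variables are set in the job environment; expand_hostfile is a hypothetical helper name, not a Slurm command.

```shell
#!/usr/bin/env bash
# Sketch: build a per-task hostfile without launching a job step.
# expand_hostfile is a hypothetical helper, not part of Slurm.
#   $1    : tasks-per-node string, e.g. "3,1" or "2(x3),1"
#   stdin : one hostname per line, in SLURM_JOB_NODELIST order
expand_hostfile() {
    local spec=$1 entry count reps i idx=0 host
    local -a counts=() entries
    IFS=',' read -r -a entries <<< "$spec"
    for entry in "${entries[@]}"; do
        count=${entry%%\(*}         # "2(x3)" -> "2", "3" -> "3"
        reps=1
        if [[ $entry == *"(x"* ]]; then
            reps=${entry#*\(x}      # "2(x3)" -> "3)"
            reps=${reps%\)}         # "3)"    -> "3"
        fi
        # Repeat the count once for each consecutive node it covers.
        for ((i = 0; i < reps; i++)); do
            counts+=("$count")
        done
    done
    # Print each hostname as many times as its task count.
    while IFS= read -r host; do
        for ((i = 0; i < ${counts[idx]:-0}; i++)); do
            printf '%s\n' "$host"
        done
        idx=$((idx + 1))
    done
}

# Only runs inside a job where Slurm has set both variables (assumption).
if [[ -n ${SLURM_JOB_NODELIST:-} && -n ${SLURM_TASKS_PER_NODE:-} ]]; then
    scontrol show hostnames "$SLURM_JOB_NODELIST" \
        | expand_hostfile "$SLURM_TASKS_PER_NODE" > hostfile
fi
```

    Unlike srun hostname, whose output lines may not arrive grouped by node, this keeps the entries in node-list order. It relies on the task counts Slurm reports rather than on where tasks actually ran, so it is worth checking the result against srun hostname | sort on your cluster.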