I have software that requires a plain-text list of nodes (one entry per task) showing where the tasks are being sent. For example, if my job was launched with -n 4 -c 1, and I get 3 CPUs on node1 and 1 CPU on node2, I'd like to get a file such as:
node1
node1
node1
node2
How can I get such a list?
I tried using:
scontrol show hostnames $SLURM_JOB_NODELIST
But this only works if all the tasks are assigned to separate nodes. In the example above, this would just result in:
node1
node2
So the software would send only one task to each node and underutilize the CPUs allocated on node1.
Thanks! Miguel.
The easiest (though perhaps not most canonical) way is probably to run
srun hostname > hostfile
The file hostfile will contain the list of hostnames, with each hostname appearing as many times as the number of tasks allocated to that host.
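If launching a job step just to collect hostnames is undesirable, a similar list can probably be built from the job's environment instead. The sketch below is an alternative, not the method above: it assumes SLURM_TASKS_PER_NODE is set inside the job, uses Slurm's compact count format (for example 3,1 or 2(x3),1), and lists counts in the same order as SLURM_JOB_NODELIST. The function name expand_tasks is hypothetical.

```shell
#!/bin/sh
# Sketch: print one hostname per task, given a node list and per-node task counts.
#   $1: hostnames separated by whitespace (spaces or newlines)
#   $2: task counts in SLURM_TASKS_PER_NODE format, e.g. "3,1" or "2(x3),1"
# expand_tasks is a hypothetical helper name. The body runs in a subshell
# (parentheses) so its variables do not leak into the caller.
expand_tasks() (
    counts=""
    # Expand the compact form: "2(x3),1" becomes "2 2 2 1".
    for entry in $(printf '%s' "$2" | tr ',' ' '); do
        count=${entry%%"("*}            # "2(x3)" -> "2"; "1" stays "1"
        reps=1
        case $entry in
            *"(x"*)
                reps=${entry##*x}       # "2(x3)" -> "3)"
                reps=${reps%")"}        # "3)" -> "3"
                ;;
        esac
        i=0
        while [ "$i" -lt "$reps" ]; do
            counts="$counts $count"
            i=$((i + 1))
        done
    done

    # Pair each node with the next count and repeat its name that many times.
    for node in $1; do
        set -- $counts
        count=${1:-0}                   # nodes without a count get no lines
        if [ "$#" -gt 0 ]; then
            shift
        fi
        counts="$*"
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '%s\n' "$node"
            i=$((i + 1))
        done
    done
)

# Assumed usage inside a batch script (requires a Slurm allocation):
#   expand_tasks "$(scontrol show hostnames "$SLURM_JOB_NODELIST")" \
#                "$SLURM_TASKS_PER_NODE" > hostfile
```

With the node list "node1 node2" and the counts "3,1" from the question, this prints node1 three times followed by node2 once. Note that the counts are tasks per node, not CPUs, so with -c greater than 1 the list still has one line per task.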