
Unable to change Slurm node state from inval to idle


I am trying to set up Slurm on a single node running Ubuntu 22.04 (WSL). I followed the configuration steps in the article at https://drtailor.medium.com/how-to-setup-slurm-on-ubuntu-20-04-for-single-node-work-scheduling-6cc909574365.

After the setup, the node shows up in the output of sinfo, but its state is initially inval. I am trying to change it to idle with sudo scontrol update nodename=localhost state=idle, but the command always fails with the error slurm_update error: Invalid node state specified.

My slurm.conf, with the parameters I have configured, is at https://gist.github.com/kmoza/11c6a9cdef085bb14d9947b63ba95ef0.


Solution

  • This usually happens when the resources Slurm finds on the node do not match what slurm.conf declares for it.

    Compare this line from your configuration

    RealMemory=8135080 State=UNKNOWN SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=2
    

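    with the output of slurmd -C, which prints the hardware configuration that slurmd actually detects on the node. Note that RealMemory is expressed in megabytes.

    If that is the cause, the slurmctld log should state the mismatch explicitly.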
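The comparison described above can be sketched in a few lines of Python. This is only an illustration: the helper names parse_node_line and find_mismatches are made up for this example, and the detected line is a hypothetical slurmd -C result, not output taken from the asker's machine. Paste your own slurmd -C line in its place.

```python
# Keys that describe hardware resources on a slurm.conf node line.
RESOURCE_KEYS = ("CPUs", "Boards", "SocketsPerBoard",
                 "CoresPerSocket", "ThreadsPerCore", "RealMemory")


def parse_node_line(line):
    """Turn 'Key=Value Key=Value ...' into a dict; tokens without '=' are skipped."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def find_mismatches(configured_line, detected_line):
    """Return {key: (configured, detected)} for resource keys whose values differ."""
    configured = parse_node_line(configured_line)
    detected = parse_node_line(detected_line)
    mismatches = {}
    for key in RESOURCE_KEYS:
        # Only compare keys that appear on both lines.
        if key in configured and key in detected:
            conf_value, det_value = int(configured[key]), int(detected[key])
            if conf_value != det_value:
                mismatches[key] = (conf_value, det_value)
    return mismatches


# Node line from the question's slurm.conf.
configured = ("RealMemory=8135080 State=UNKNOWN SocketsPerBoard=1 "
              "CoresPerSocket=8 ThreadsPerCore=2")
# HYPOTHETICAL slurmd -C line, for illustration only.
detected = ("NodeName=localhost CPUs=16 Boards=1 SocketsPerBoard=1 "
            "CoresPerSocket=8 ThreadsPerCore=2 RealMemory=7944")

print(find_mismatches(configured, detected))
```

A configured value that is larger than the detected one is the likely culprit. Since RealMemory is in megabytes, 8135080 would mean roughly 8 TB of RAM, which suggests the figure may have been entered in kilobytes.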