python linux ssh slurm sbatch

How to SSH to a SLURM scheduler and execute SLURM commands from a Python script rather than from the terminal?


Currently I am connecting to the SLURM scheduler and running the requested SLURM commands (such as sbatch and squeue) in the following manner (using Ubuntu 22.04):

  • SSH to the SLURM job scheduler using the necessary credentials in the CMD
  • Submit the SLURM command sbatch mysbatchscript.sh
  • Any other SLURM commands such as squeue are also submitted in the CMD as done previously

The last line of this sbatch script, mysbatchscript.sh, is python3 main_deeplearning.py. The aim of the sbatch script is to run a deep learning model using CUDA.

My questions are as follows:

  • Is there a way to initiate the SSH connection and run those SLURM commands from the same Python script, rather than continuing to use the CMD?
  • When executing the SLURM command sbatch mysbatchscript.sh, I would also like to pass arguments defined in my Python script. Can the sbatch command accept such arguments, e.g. sbatch mysbatchscript.sh arg1 arg2, where arg1 and arg2 are input arguments to the deep learning model in main_deeplearning.py?

The only Python module I found that comes close to interacting with the SLURM scheduler is Pyslurm. However, I do not know how to combine it with the SSH connection, which has to be established before calling any Pyslurm methods, or how to point it at the SLURM job scheduler.

Thus, up to now I have kept using the 'traditional' method: connecting over SSH and running SLURM commands such as sbatch and squeue through the Ubuntu terminal.


Solution

  • Yes, you can use the paramiko library in Python to do this from your local machine.

    In general, a script like the following, run on your local (client) machine, opens the SSH connection:

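    import getpass
    import time

    import paramiko

    # Prompt for the password so it is never stored in the script.
    password = getpass.getpass('Enter your password: ')

    ssh = paramiko.SSHClient()
    # AutoAddPolicy trusts unknown host keys; fine for a quick test, but
    # consider ssh.load_system_host_keys() for anything long-lived.
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect('servername', username='<your_username>', password=password)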
    

    With the connection open, you can submit an sbatch script by giving its path on the server:

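    stdin, stdout, stderr = ssh.exec_command('sbatch path_to_script.job')
    tokens = stdout.read().decode().split()
    if not tokens:  # empty stdout usually means sbatch failed; the reason is on stderr
        raise RuntimeError(stderr.read().decode().strip())
    job_id = int(tokens[-1])  # sbatch prints "Submitted batch job <id>"
    print(f'Submitted job with ID {job_id}')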
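
On the second question: sbatch passes anything after the script name to the batch script as positional parameters ($1, $2, ... or "$@"). A minimal sketch of building such a command on the Python side follows. The helper name build_sbatch_command and the example values are hypothetical, and it assumes the last line of mysbatchscript.sh is changed to forward its arguments, e.g. python3 main_deeplearning.py "$@".

```python
import shlex


def build_sbatch_command(script_path, *args):
    """Return a shell-safe 'sbatch <script> <args...>' command string.

    Hypothetical helper: every argument is converted to a string and
    quoted so the remote shell does not split or reinterpret it.
    """
    return shlex.join(["sbatch", script_path, *map(str, args)])


# Example values are placeholders, not taken from the original question.
command = build_sbatch_command("mysbatchscript.sh", 0.001, "my model")
print(command)
```

The resulting string can be handed to ssh.exec_command(command) in place of the hard-coded 'sbatch path_to_script.job'. Inside main_deeplearning.py the values then arrive through sys.argv (or argparse), as with any command-line script.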