Tags: bash, while-loop, slurm

Script to submit multiple slurm jobs from different, although matching, subdirectories


Lately I've been generating several series of input files to run calculations on an HPC using a Slurm queuing system. These calculations are all placed in subdirectories of the form D001, D002, etc. (D, followed by a three-digit number, in numerical order). In these directories are my input files, always ending in .inp (for the series I'm looking at now, the files are all called tma.h2s-2-vpt2-b97m-d4-qz_D001.inp, etc., where the number in the input file matches the name of its subdirectory), and Slurm run scripts (these particular ones are all called slurm-run-orca.job).

So far, I've been laboriously submitting each of the calculations manually, by running (from the parent directory that contains all of the subdirectories):

cd D001
sbatch -J tma.h2s-2-vpt2-b97m-D4-qz_D001 slurm-run-orca.job
cd ../D002
sbatch -J tma.h2s-2-vpt2-b97m-D4-qz_D002 slurm-run-orca.job
cd ../D003
sbatch -J tma.h2s-2-vpt2-b97m-D4-qz_D003 slurm-run-orca.job

etc.

This is very time-consuming and, to be honest, makes my fingers very stiff. The last series I ran had 66 calculations, which I did get through eventually, but my current batch has 84...

Is there any way I can write a script to do this automatically? (Note, the input files are submitted without the .inp appended, despite the files all including a .inp extension.) I've searched for days to find a similar question/answer to this situation, but, funnily enough, I've not found anything that treats a problem this simple (at least, this problem seems simple — I could very well be mistaken, though)...

Edit: To clarify, the only files in the subdirectories are the job files and the input files, i.e.:

ls
D001   D002   D003   D004 ...

ls D001/
tma.h2s-2-vpt2-b97m-d4-qz_D001.inp   slurm-run-orca.job

So far, I have tried several things — if/then statements, while and for loops — but the fact is that I don't know enough about bash to get something like this to work (I can edit scripts, but I've never successfully written a script from scratch). The closest I've gotten to making the script work is:

#!/bin/bash

work_dir="pwd"
base=$1

n=001
submit_dir=$base$(( n ))

while ii in $submit_dir ; do
        cd $submit_dir
        squeue -J *.inp *.job
        cd $work_dir
        n=$(( $n + 1 ))
echo "Submitting calculations to queue"
done

I tried to add an if/then statement to make sure the loop would stop when it reached the end of the series (I have a feeling that would be unnecessary though), and to print out a "Could not find calculation subdirectories" message when relevant, but then started focussing on getting the while loop to work. I am aware that what I have written so far will include the .inp extension, but I thought it would be worth getting the script to recognise that there were files in the directories first, and then worry about stripping the .inp...

Additionally, if possible, I would like to be able to submit the calculations in batches of 20, either with a specific time delay or by using a command like ./job-submission <base-name-of-input-files> 001-020 (for example), just so I don't clog up the queue (we have a very limited number of nodes available to us).

Any help getting this to work would be greatly appreciated!


Solution

  • Here's how I would do it:

    #!/usr/bin/env bash
    shopt -s nullglob
    
    for workdir in "$@"
    do
        pushd "$workdir" > /dev/null || continue
    
        for inpfile in *_"${PWD##*/}".inp
        do
            sbatch -J "${inpfile%.*}" slurm-run-orca.job
        done
    
        popd > /dev/null
    done
    

    Then specify the directories from which you want to submit jobs as arguments:

    ./myscript.sh some/path/to/D*
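
    To submit in batches (as asked in the question), one option is to let the shell generate the directory names instead of using D*. A minimal sketch, assuming bash 4 or later (where brace expansion keeps the zero padding) and the D001, D002, ... naming from the question; the batch array is hypothetical and only illustrates what the expansion produces:

    ```shell
    #!/usr/bin/env bash
    # Assumption: bash >= 4, so {001..020} expands with leading zeros preserved.
    # "batch" is an illustrative name, not part of the script above.
    batch=( D{001..020} )

    echo "${#batch[@]}"    # number of generated names
    echo "${batch[0]}"     # first name in the batch
    echo "${batch[19]}"    # last name in the batch
    ```

    With that, ./myscript.sh D{001..020} would submit the first batch and ./myscript.sh D{021..040} the next. Brace expansion generates the names whether or not the directories exist; the pushd ... || continue line in the script skips any that are missing.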
    

    Notes
    • pushd and popd are bash builtins that you can use for cding to a directory and then returning to the previous location.

    • The shell variable PWD contains the path of the current directory; ${PWD##*/} expands to its last component (i.e. everything up to and including the last / is stripped).
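
The two parameter expansions used in the script can be tried on their own. A small sketch, using a made-up path (the dir value is an assumption for illustration only) and the input file name from the question:

```shell
#!/usr/bin/env bash
# Hypothetical directory path, used only to illustrate the expansions.
dir="/home/user/calcs/D001"
inpfile="tma.h2s-2-vpt2-b97m-d4-qz_D001.inp"

# ##*/ removes the longest prefix matching "*/", leaving the last path component.
last_component="${dir##*/}"

# %.* removes the shortest suffix matching ".*", i.e. only the final extension.
job_name="${inpfile%.*}"

echo "$last_component"   # D001
echo "$job_name"         # tma.h2s-2-vpt2-b97m-d4-qz_D001
```

Because % removes the shortest matching suffix, the earlier dot in tma.h2s is left alone and only .inp is stripped, which gives the job name passed to sbatch -J.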