cluster-computing, slurm, sbatch

Iterative slurm job


I'm trying to optimize a study I'm doing. I currently have two job scripts, which I call step1 and step2. In step1:

#!/bin/bash

#SBATCH --output=slurm-%j.out
#SBATCH --nodes=16
#SBATCH --ntasks-per-node=28
#SBATCH --time=24:00:00

module load <everything I need>



echo "Start of program at `date`"

srun $HOME/project/bin/my_executable1 ../data/my_datafile0.dat

echo "End of program at `date`"

After this job is done I have a new datafile, which we can call my_datafile1.dat, and this goes into the second job script, step2:

#!/bin/bash

#SBATCH --output=slurm-%j.out
#SBATCH --nodes=16
#SBATCH --ntasks-per-node=28
#SBATCH --time=24:00:00

module load <everything I need>



echo "Start of program at `date`"

srun $HOME/project/bin/my_executable1 ../data/my_datafile1.dat

echo "End of program at `date`"

After this job I have a new datafile called my_datafile2.dat, which I use in step1 again, then the new output in step2, and so on. I'm wondering if there is a way to write a job script that does this iteration for me. I would like to tell it to do 20 iterations and end up with my_datafile1.dat, my_datafile2.dat, ..., my_datafile20.dat.


Solution

  • In a single job? If so, you could just use a loop, as follows (with a bit of verbosity in the inner loop, which can be replaced by a single line if wanted).

    Edit after the clarification in the comments below: basically, one step is an execution of my_executable1 followed by my_executable2. To simplify, let's call the output of my_executable1 A and the output of my_executable2 B:

    #!/bin/bash
    
    #SBATCH --output=slurm-%j.out
    #SBATCH --nodes=16
    #SBATCH --ntasks-per-node=28
    #SBATCH --time=24:00:00
    
    module load <everything I need>
    
    echo "Start of program at `date`"
    
    # 10 iterations, each running my_executable1 then my_executable2
    for I in $(seq 10); do
      CMD="srun $HOME/project/bin/my_executable1 ../data/my_datafile_A_${I}.dat"
      echo "Launching command \"$CMD\" at $(date)"
      eval $CMD
      CMD="srun $HOME/project/bin/my_executable2 ../data/my_datafile_B_${I}.dat"
      echo "Launching command \"$CMD\" at $(date)"
      eval $CMD
    done
    
    echo "End of program at `date`"
    

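    Submitted once with sbatch, this single job works through all ten iterations (twenty sub-steps in total) back to back. A minimal usage sketch, assuming the script above is saved as step_loop.sh (that filename is just an example):

    sbatch step_loop.sh    # one job ID covers the whole loop
    squeue -u $USER        # the iterations run inside a single job allocation
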
    If for some reason you really want to increment the index at each substep, you can use bc for the small computation:

    #!/bin/bash
    
    #SBATCH --output=slurm-%j.out
    #SBATCH --nodes=16
    #SBATCH --ntasks-per-node=28
    #SBATCH --time=24:00:00
    
    module load <everything I need>
    
    echo "Start of program at `date`"
    
    # 10 iterations; bc computes the running file index 1..20 across the two sub-steps
    for I in $(seq 10); do
      INDEX=$(echo "2*$I-1" | bc)   # odd index for my_executable1
      CMD="srun $HOME/project/bin/my_executable1 ../data/my_datafile_${INDEX}.dat"
      echo "Launching command \"$CMD\" at $(date)"
      eval $CMD
      INDEX=$(echo "2*$I" | bc)   # even index for my_executable2
      CMD="srun $HOME/project/bin/my_executable2 ../data/my_datafile_${INDEX}.dat"
      echo "Launching command \"$CMD\" at $(date)"
      eval $CMD
    done
    
    echo "End of program at `date`"