I'm trying to optimize a study I'm doing. I currently have two job scripts, which I call step1 and step2. step1 looks like this:
#!/bin/bash
#SBATCH --output=slurm-%j.out
#SBATCH --nodes=16
#SBATCH --ntasks-per-node=28
#SBATCH --time=24:00:00
module load <everything I need>
echo "Start of program at `date`"
srun $HOME/project/bin/my_executable1 ../data/my_datafile0.dat
echo "End of program at `date`"
After this job is done I have a new data file, which we can call my_datafile1.dat, and this goes into the second job script, step2:
#!/bin/bash
#SBATCH --output=slurm-%j.out
#SBATCH --nodes=16
#SBATCH --ntasks-per-node=28
#SBATCH --time=24:00:00
module load <everything I need>
echo "Start of program at `date`"
srun $HOME/project/bin/my_executable1 ../data/my_datafile1.dat
echo "End of program at `date`"
After this job I have a new data file called my_datafile2.dat, which I use in step1 again, then the new one in step2, and so on. Is there a way to write a job script that does this iteration for me? I would like to tell it to do 20 iterations, so that I end up with my_datafile1.dat, my_datafile2.dat, ..., my_datafile20.dat.
In a single job? If so, you could just use a loop, as follows (the loop body is a bit verbose, but each launch can be replaced by a single line if wanted).
Edit after the clarification in the comments below: basically, one step is an execution of my_executable1 followed by my_executable2. To simplify, let's call A the output of 1 and B the output of 2:
#!/bin/bash
#SBATCH --output=slurm-%j.out
#SBATCH --nodes=16
#SBATCH --ntasks-per-node=28
#SBATCH --time=24:00:00
module load <everything I need>
echo "Start of program at `date`"
for I in $(seq 10); do
    CMD="srun $HOME/project/bin/my_executable1 ../data/my_datafile_A_${I}.dat"
    echo "Launching command \"$CMD\" at $(date)"
    eval $CMD
    CMD="srun $HOME/project/bin/my_executable2 ../data/my_datafile_B_${I}.dat"
    echo "Launching command \"$CMD\" at $(date)"
    eval $CMD
done
echo "End of program at `date`"
If for some reason you really want to increment the index at each substep, you can use bc for the small computation:
#!/bin/bash
#SBATCH --output=slurm-%j.out
#SBATCH --nodes=16
#SBATCH --ntasks-per-node=28
#SBATCH --time=24:00:00
module load <everything I need>
echo "Start of program at `date`"
for I in $(seq 10); do
    INDEX=$(echo "2*$I-1" | bc)
    CMD="srun $HOME/project/bin/my_executable1 ../data/my_datafile_${INDEX}.dat"
    echo "Launching command \"$CMD\" at $(date)"
    eval $CMD
    INDEX=$(echo "2*$I" | bc)
    CMD="srun $HOME/project/bin/my_executable2 ../data/my_datafile_${INDEX}.dat"
    echo "Launching command \"$CMD\" at $(date)"
    eval $CMD
done
echo "End of program at `date`"
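As a possible variation on the loop above: bash's built-in arithmetic expansion `$(( ... ))` can compute the index without calling bc, and checking the exit status of each launch stops the chain as soon as one substep fails, so later substeps are not started on a data file that was never produced. This is only a minimal sketch under the same path assumptions as above. The names `run_chain`, `LAUNCHER` and `N_ITER` are hypothetical and introduced here for illustration; `LAUNCHER` defaults to `echo` so the loop can be dry-run outside of Slurm.

```bash
#!/bin/bash
# Sketch only: run_chain, LAUNCHER and N_ITER are hypothetical names,
# not part of the original job scripts.
# LAUNCHER defaults to "echo" (dry run, just prints the commands);
# set LAUNCHER=srun inside a real job script.
LAUNCHER=${LAUNCHER:-echo}
N_ITER=${N_ITER:-10}

# run_chain N: run N iterations of my_executable1 followed by my_executable2,
# returning non-zero as soon as one launch fails.
run_chain() {
    local n=$1 i odd even
    for ((i = 1; i <= n; i++)); do
        odd=$((2 * i - 1))   # index used for the my_executable1 substep
        even=$((2 * i))      # index used for the my_executable2 substep
        "$LAUNCHER" "$HOME/project/bin/my_executable1" "../data/my_datafile_${odd}.dat" || return 1
        "$LAUNCHER" "$HOME/project/bin/my_executable2" "../data/my_datafile_${even}.dat" || return 1
    done
}

run_chain "$N_ITER"
```

In a job script this would go after the `#SBATCH` lines and the `module load` line, with `LAUNCHER=srun` set before the call. Keep in mind that all iterations then have to fit within the single `--time` limit.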