How to save/record SLURM script's config parameters to the output file?

I'm new to HPC, and to SLURM in particular. Here is an example of the script I use to run my Python code:


#!/bin/bash
# Slurm submission script, serial job

#SBATCH --time 48:00:00
#SBATCH --mem 0
#SBATCH --mail-type ALL
#SBATCH --partition gpu_v100
#SBATCH --gres gpu:4
#SBATCH --nodes 4
#SBATCH --ntasks-per-node=4

#SBATCH --output R-%x.%j.out
#SBATCH --error R-%x.%j.err


module load python3-DL/torch/1.6.0-cuda10.1

srun python3 \
      --gpus 4 \
      --max_epochs 1024 \
      --batch_size 256 \
      --num_nodes 4 \
      --num_workers 8

Now, every time I run this script with sbatch, it generates an .err and an .out file, and I can only encode the job name (%x) and the job ID (%j) into their filenames. But how can I save a copy of all the parameters I set in the script above, both the SLURM configuration and the Python arguments, tied to the job ID and the generated .out and .err files?

For example, if I run the script above four times in a row, each time with different parameters, it's not clear from those files which run used which parameters unless I manually keep track of the parameters and job IDs. There should be some way to automate this in SLURM, no?


  • Add the following two lines at the end of your submission script:

    scontrol show job $SLURM_JOB_ID
    scontrol write batch_script $SLURM_JOB_ID -

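    This writes the job description (as reported by scontrol show job) and a copy of the submission script to the end of the .out file, so both are stored alongside the job ID already encoded in the filename. The trailing - tells scontrol to write the script to standard output instead of to a separate file.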
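
As a rough sketch of how those two lines might sit in a submission script: the helper name record_job_config is hypothetical (not part of SLURM), and the guard is an assumption added so the snippet does nothing harmful when run outside a job. Calling it before the srun line, rather than at the very end, is one possible variation, so that the information is still recorded if the job later fails or hits its time limit.

```shell
#!/bin/bash
#SBATCH --output R-%x.%j.out
#SBATCH --error R-%x.%j.err

# Hypothetical helper: dump the job description and the submitted script
# to standard output, which SLURM redirects into the .out file.
record_job_config() {
    # Only call scontrol when running inside a SLURM job and scontrol exists.
    if [ -n "${SLURM_JOB_ID:-}" ] && command -v scontrol >/dev/null 2>&1; then
        scontrol show job "$SLURM_JOB_ID"
        # "-" writes the batch script to standard output instead of a file.
        scontrol write batch_script "$SLURM_JOB_ID" -
    else
        echo "not running under Slurm; nothing to record"
    fi
}

# Record the configuration first, then launch the actual work (srun ...).
record_job_config
```

Because the submitted script contains both the #SBATCH directives and the Python arguments, a single copy in the .out file covers both kinds of parameters.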