bash function parallel-processing slurm

Parallel function call in slurm sbatch script


I want to unzip big files, so I wrote an sbatch script.

My script defines a function that unzips a file, and I want the job to make multiple calls to this function in parallel.

This is my script:

#!/bin/bash
#SBATCH --job-name=gunzip_hs2
#SBATCH --output %x-%j.out
#SBATCH --partition=lirmm
#SBATCH --account=atgc
#SBATCH --time=23:59:00
#SBATCH -n 4

gzipkeep() {
    filename=$(basename $1)
    gunzip -c -- "$1" > "uncompressed_fastq_files_H_Sapiens2/${filename::-3}"
}

export -f gzipkeep

mkdir -p "uncompressed_fastq_files_H_Sapiens2"

for file in /home/bunelp/scratch/jobs/fastq_files_H_Sapiens2/*.gz
do
    srun -n1 --exclusive gzipkeep $file &
done

wait

The problem is that srun doesn't know the function gzipkeep defined above, and I get the following error:

slurmstepd: error: execve(): gzipkeep: No such file or directory

I guess I'm missing something, because I couldn't find anyone else with this problem, but I don't know what I am doing wrong. My function is very simple, so I could find a workaround that avoids it, but I want to understand what I should do if I had a much more complex function.


Solution

  • The SO thread linked in the comment by @Niloct gives a workaround for using the function in your srun call. Still, the reason a function passed as an argument (srun fn_name) does not work as straightforwardly as a command (e.g. srun hostname) can be explained as follows.

    Consider the following call:

    srun -n N executable args
    

    srun uses path resolution to locate the executable named in the command. The executable is resolved as follows (see this link):

    1. If executable starts with ".", the path is constructed as: current working directory / executable.
    2. If executable starts with "/", the path is considered absolute.
    3. Otherwise, executable is looked up through PATH. See path_resolution(7).
    4. Otherwise, executable is looked for in the current working directory.
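A rough way to see the consequence of these rules without a cluster is to use env as a stand-in for srun, since both hand the name to exec as a program file rather than asking the shell. This is only a sketch: the function name demo_fn_not_on_path is hypothetical, and it assumes no program with that name exists on PATH.

```shell
#!/bin/bash
# Hypothetical function; it exists only inside this shell, not as a file on disk.
demo_fn_not_on_path() {
    echo "hello from $1"
}
export -f demo_fn_not_on_path

# env looks the name up as a program file (as srun does), so this lookup fails.
if env demo_fn_not_on_path world 2>/dev/null; then
    direct="found"
else
    direct="not found"
fi
echo "direct exec: $direct"

# A child bash imports exported functions from its environment, so this works.
via_bash=$(env bash -c 'demo_fn_not_on_path "$1"' _ world)
echo "via bash -c: $via_bash"
```

The second call succeeds because the program being executed is bash itself, which is a real file on PATH; the function is only resolved afterwards, inside that child shell.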

    A shell function exists only inside the bash process that defined it, not as a file on disk. Path resolution by srun therefore finds no file matching the function name, which results in the error No such file or directory.
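One possible workaround, in the spirit of the linked thread, is to have srun execute bash and let that bash call the exported function. The sketch below is an adaptation of the script from the question, not a tested cluster recipe. It assumes srun forwards the job environment to each step (its default behaviour), so the function exported with export -f reaches the child shell. The nullglob option and the ${filename%.gz} suffix stripping are additions made here for robustness.

```shell
#!/bin/bash
# (#SBATCH directives from the question go here, unchanged.)

shopt -s nullglob   # addition: skip the loop entirely if no .gz file matches

gzipkeep() {
    local filename
    filename=$(basename -- "$1")
    # Write the decompressed data next to the job, keeping the original archive.
    gunzip -c -- "$1" > "uncompressed_fastq_files_H_Sapiens2/${filename%.gz}"
}

# Put the function into the environment so child bash processes can import it.
export -f gzipkeep

mkdir -p "uncompressed_fastq_files_H_Sapiens2"

for file in /home/bunelp/scratch/jobs/fastq_files_H_Sapiens2/*.gz
do
    # srun executes bash (a real file on PATH); bash then runs the function.
    # "_" fills $0 so that the file name arrives as $1.
    srun -n1 --exclusive bash -c 'gzipkeep "$1"' _ "$file" &
done

wait
```

For a much more complex function, the same pattern should still apply, as long as every helper function and variable it relies on is exported as well. Moving the function into its own executable script file and passing that file to srun is another option that sidesteps the problem.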