I want to unzip big files, so I wrote an sbatch script.
My script defines and uses a function for unzipping a file, and I want the job to make multiple calls to this function in parallel.
This is my script:
#!/bin/bash
#SBATCH --job-name=gunzip_hs2
#SBATCH --output %x-%j.out
#SBATCH --partition=lirmm
#SBATCH --account=atgc
#SBATCH --time=23:59:00
#SBATCH -n 4
gzipkeep() {
    # Strip the directory, then drop the trailing ".gz" from the output name
    filename=$(basename "$1")
    gunzip -c -- "$1" > "uncompressed_fastq_files_H_Sapiens2/${filename::-3}"
}
export -f gzipkeep
mkdir -p "uncompressed_fastq_files_H_Sapiens2"
for file in /home/bunelp/scratch/jobs/fastq_files_H_Sapiens2/*.gz
do
    srun -n1 --exclusive gzipkeep $file &
done
wait
The problem is that srun doesn't know the function gzipkeep defined above, and I get the following error:
slurmstepd: error: execve(): gzipkeep: No such file or directory
I guess I'm missing something because I didn't find anyone with the same problem, but I don't know what I am doing wrong. My function is very simple, so I could find a workaround without it, but I want to understand what I should do if I had a much more complex function.
The SO thread linked in the comment by @Niloct gives a workaround for using the function in your srun script. Nevertheless, why a function (as the argument in srun fn_name) does not work as straightforwardly as a command (e.g. srun hostname) can be explained as follows.
Consider the following call:
srun -n N executable args
srun uses path resolution to locate the executable provided in the command. The executable is resolved as follows (see this link):
srun's path resolution finds no file whose name matches the function name, which results in the error No such file or directory.
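The difference can be reproduced without a cluster. The sketch below is only an illustration under stated assumptions: demo_unzip is a hypothetical stand-in for gzipkeep, and env stands in for srun, since both look for an executable file rather than a shell function. Wrapping the call in bash -c lets a child bash import the exported function and run it.

```shell
#!/bin/bash
# Hypothetical stand-in for gzipkeep; it only echoes its argument.
demo_unzip() {
    echo "would unzip $1"
}
export -f demo_unzip

# The name exists only inside the shell; no file on PATH backs it.
kind=$(type -t demo_unzip)
on_path=$(type -P demo_unzip || true)

# env (used here in place of srun) searches for an executable file,
# so calling the function name directly fails.
direct=$(env demo_unzip sample.gz 2>/dev/null) || direct="launch failed"

# A child bash imports the exported function from the environment,
# so the same launcher succeeds when it starts bash instead.
wrapped=$(env bash -c 'demo_unzip "$1"' _ sample.gz)

echo "kind=$kind on_path=${on_path:-none}"
echo "direct=$direct"
echo "wrapped=$wrapped"
```

In the script from the question, the corresponding change would presumably be srun -n1 --exclusive bash -c 'gzipkeep "$1"' _ "$file" &. This assumes that srun forwards the job environment, including the exported function, to the step, and that bash is available on the compute nodes.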