
Chaining jobs in GNU parallel


I am trying to execute this command with a tool called TOOL.sh. My parallel input values are 0.01, 0.02 and 0.05.

On its own, this command produces four output files for each --maf value. I only want to keep the fourth, the log file, and remove the other three (.bim, .bed, .fam) immediately after they are produced, as shown below. How should I write this so that it works?

PLINKfile=file
DIRplink=$PWD
OUTFILE="PWD/test"
parallel -j3 TOOL.sh --bfile ${PLINKfile} --maf {} --out $DIRplink/${OUTFILE}_{} | 
rm "${DIRplink}/${OUTFILE}_{}.bed" && 
rm "${DIRplink}/${OUTFILE}_{}.bim" &&
rm "${DIRplink}/${OUTFILE}_{}.fam"
::: 0.01 0.02 0.05

Solution

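  • The problem is that the outer shell parses | and && itself, so parallel only ever receives the TOOL.sh part: its output is piped into the first rm, and the ::: line ends up as a separate command. You intended the rm calls to be part of the command that parallel executes for each value. To do that, just enclose the whole command in double quotes: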

    parallel -j3 "TOOL.sh --bfile ${PLINKfile} --maf {} --out $DIRplink/${OUTFILE}_{};
                  rm ${DIRplink}/${OUTFILE}_{}.{bed,bim,fam}" ::: 0.01 0.02 0.05
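The quoted-command approach can be tried out safely before pointing it at real data. The following is only a sketch under stated assumptions: `fake_tool.sh` is a hypothetical stand-in for TOOL.sh that writes four empty-ish files per prefix, and everything happens in a temporary directory. It differs from the command above in two ways: it uses `&&` so the files are only removed if the tool succeeded, and it lists the three extensions explicitly, because `{bed,bim,fam}` is brace expansion, which is not part of POSIX sh and so depends on which shell parallel launches. If GNU parallel is not installed, the sketch falls back to a sequential loop.

```shell
#!/bin/sh
# Sketch: run a stand-in tool once per --maf value and delete the
# unwanted outputs right after each job, keeping only the .log file.
workdir=$(mktemp -d)
tool="$workdir/fake_tool.sh"
prefix="$workdir/test"

# Hypothetical stand-in for TOOL.sh (not the real tool).
# usage: fake_tool.sh --maf VALUE --out PREFIX
cat > "$tool" <<'EOF'
#!/bin/sh
out=$4
for ext in bed bim fam log; do
    echo "maf=$2" > "$out.$ext"
done
EOF
chmod +x "$tool"

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # The whole double-quoted string is the job template;
    # parallel replaces every {} with the current input value.
    parallel -j3 "'$tool' --maf {} --out '$prefix'_{} && \
        rm -f '$prefix'_{}.bed '$prefix'_{}.bim '$prefix'_{}.fam" ::: 0.01 0.02 0.05
else
    # Sequential fallback when GNU parallel is unavailable.
    for maf in 0.01 0.02 0.05; do
        "$tool" --maf "$maf" --out "${prefix}_${maf}" &&
            rm -f "${prefix}_${maf}.bed" "${prefix}_${maf}.bim" "${prefix}_${maf}.fam"
    done
fi

# List what is left in the working directory.
ls "$workdir"
```

With a real tool, GNU parallel's `--dry-run` option prints the commands it would run without executing them, which is a quick way to check that the quoting and the `{}` substitutions come out as intended.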
    

    Alternatively, just delete all the unwanted files in one go after parallel finishes:

    parallel -j3 TOOL.sh --bfile ${PLINKfile} --maf {} --out $DIRplink/${OUTFILE}_{} ::: 0.01 0.02 0.05
    rm ${DIRplink}/${OUTFILE}_*.{bed,bim,fam}
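One thing to keep in mind with the wildcard cleanup is that `_*` matches every file with that prefix, including outputs left over from earlier runs with other values. A minimal sketch of a narrower cleanup, assuming you only want to remove the files for the values used in this run: the temporary directory, the `OUTFILE=test` prefix, and the empty files created here are all stand-ins for real TOOL.sh output, and the `0.10` files only pretend to come from a previous run.

```shell
#!/bin/sh
# Stand-in setup: fake the files the tool would leave behind (hypothetical data).
DIRplink=$(mktemp -d)
OUTFILE=test
for maf in 0.01 0.02 0.05 0.10; do      # 0.10 pretends to be from an earlier run
    for ext in bed bim fam log; do
        : > "${DIRplink}/${OUTFILE}_${maf}.${ext}"
    done
done

# Cleanup restricted to the values of this run; .log files are never touched.
for maf in 0.01 0.02 0.05; do
    for ext in bed bim fam; do
        rm -f -- "${DIRplink}/${OUTFILE}_${maf}.${ext}"
    done
done

# The logs and the untouched 0.10 files should remain.
ls "$DIRplink"
```

The nested loop is plain POSIX sh, so it does not rely on brace expansion being available in the shell that runs the script.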