I'm trying to adapt the following lines of code for use with GNU parallel:
for ID in $(cut -f1 markers.tsv); do
    echo $ID
    FAA=${ID}.faa.gz
    zcat ${FAA} | muscle -out ${ID}.msa
done
Preferably without creating an intermediate script.
However, the examples I'm seeing here do not show where I can use my ${ID}
argument.
This could be a one-liner:
for ID in $(cut -f1 markers.tsv); do echo $ID && FAA=${ID}.faa.gz && zcat ${FAA} | muscle -out ${ID}.msa; done
I'm trying this but it appears to not be running the jobs simultaneously:
cut -f1 markers.tsv | parallel -j 16 -I @ 'echo "@" && FAA="@.faa.gz" && zcat $FAA | muscle -out @.msa'
Can someone help me adapt this using 16 jobs correctly?
Example markers.tsv:
PF00709.21\t1\ta
PF00406.22\t2\tb
PF01808.18\t3\tc
Due to a bug in GNU Parallel, an input line cannot be longer than the maximum command line length.
cut -f1 markers.tsv |
parallel -j16 'echo {} && zcat {}.faa.gz | muscle -out {}.msa'
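If you want to check how {} gets substituted before launching the real jobs, something like the following may help. It is only a sketch: it rebuilds the example markers.tsv in a scratch directory and asks GNU Parallel to print the commands with --dry-run instead of running them, so neither muscle nor the .faa.gz files are needed. The scratch directory and the commands.txt file name are my own placeholders, not part of the question.

```shell
#!/bin/sh
# Sketch: preview the commands GNU Parallel would run, without executing them.
set -eu

workdir=$(mktemp -d)   # hypothetical scratch directory
cd "$workdir"

# Recreate the example markers.tsv with real tab separators.
printf 'PF00709.21\t1\ta\nPF00406.22\t2\tb\nPF01808.18\t3\tc\n' > markers.tsv

# Only proceed if the "parallel" on PATH is GNU Parallel
# (moreutils ships a different tool with the same name).
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # --dry-run prints each command instead of running it;
    # -k keeps the output in the same order as the input.
    cut -f1 markers.tsv |
        parallel -k -j16 --dry-run 'echo {} && zcat {}.faa.gz | muscle -out {}.msa' > commands.txt
    cat commands.txt
else
    echo "GNU Parallel not found on PATH" >&2
fi
```

This should print one command line per ID in the first column, with {} replaced by that ID. Once the output looks right, dropping --dry-run (and the redirect to commands.txt) runs the jobs for real, up to 16 at a time.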