I'm running a metagenomics pipeline in Snakemake. I use MetaSPAdes for my assemblies, but it's not uncommon for MetaSPAdes to fail on particular samples. If MetaSPAdes fails, I want to run MEGAHIT on only the samples it failed for. Is there any way to create this sort of rule dependency in Snakemake?
For example:
Has anyone figured out an elegant way to do something like this?
I'm not familiar with the programs you mention, but I don't think you need separate rules for this. You can write a single rule that tries to run MetaSPAdes first and, if it fails, runs MEGAHIT instead. For example:
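rule assembly:
    input:
        '{sample}.in',
    output:
        '{sample}.out',
    run:
        import subprocess
        # Placeholder command lines; substitute the real assembler invocations.
        # Popen does not fill in {input}/{output} the way shell() does, so use an f-string.
        p = subprocess.Popen(f"MetaSPAdes {input} > {output}", shell=True, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
        stdout, stderr = p.communicate()
        if p.returncode != 0:
            # MetaSPAdes failed for this sample: fall back to MEGAHIT.
            shell("megahit {input} > {output}")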
Here, stdout, stderr = p.communicate() waits for the process to finish and captures its stdout and stderr; the return code is then available as p.returncode. You can analyse stderr and/or the return code to decide what to do next. You probably need something more than the above, but hopefully the idea is about right.
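If it helps to try the fallback logic outside Snakemake first, here is a minimal sketch in plain Python. The helper name run_with_fallback and the two stand-in commands are my own assumptions for illustration; they are not part of Snakemake, MetaSPAdes or MEGAHIT, and the real assembler command lines would go in their place.

```python
import subprocess
import sys


def run_with_fallback(primary, fallback):
    """Run `primary`; if it exits non-zero, run `fallback` instead.

    Both arguments are argument lists. This helper is hypothetical,
    not something Snakemake provides.
    Returns (label, CompletedProcess), where label is "primary" or "fallback".
    """
    first = subprocess.run(primary, capture_output=True, text=True)
    if first.returncode == 0:
        return "primary", first
    # Pass the failed tool's stderr through so the reason for the fallback is kept.
    sys.stderr.write(first.stderr)
    # check=True raises CalledProcessError if the fallback fails as well.
    second = subprocess.run(fallback, capture_output=True, text=True, check=True)
    return "fallback", second


# Stand-in commands (assumptions): one that always fails, one that always succeeds.
failing = [sys.executable, "-c", "import sys; sys.exit('assembler crashed')"]
working = [sys.executable, "-c", "print('contigs written')"]

label, result = run_with_fallback(failing, working)
print(label, result.stdout.strip())
```

Inside a run: block the same function could be called with the real command lines built from input and output. Returning the label makes it possible to record which assembler actually produced each sample's output.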