snakemake - do not delete output of failed rules


I have a snakemake workflow containing a rule that runs another "inner" snakemake workflow.
Sometimes a certain rule of the inner workflow fails, which means the inner workflow fails. As a result, all files listed under the output of the inner workflow are deleted by the outer workflow, even if the rules of the inner workflow that created them completed successfully.
Is there a way to prevent snakemake from deleting the outputs of failed rules? Or maybe you can suggest another workaround?
A few notes:

  • The outputs of the inner workflow must be listed, b/c they are used as input for other rules in the outer workflow.
  • I tried setting the outputs of the inner workflow as protected, but this didn't help.
  • I've also tried adding exit 0 to the end of the call to the inner workflow, to make snakemake think it completed successfully, like this:

rule run_inner:
    input:
        inputs...
    output:
        outputs...
    shell:
        """
        snakemake -s inner.snakefile
        exit 0
        """

but outputs are still deleted.
Would appreciate any help. Thanks!


Solution

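  • One option may be to have run_inner produce a dummy output file that flags the completion of the rule, instead of listing the real outputs. Because the real files are no longer declared as outputs of run_inner, the outer workflow does not delete them when the rule fails. Rules following run_inner take the dummy file as input. E.g.: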

    rule run_inner:
        ...
        output:
            # or just 'run_inner.done' if wildcards are not involved
            touch('{sample}.run_inner.done'), 
        shell:
            'snakemake -s inner.snakefile'
    
    rule next:
        input:
            '{sample}.run_inner.done',
        params:
            real_input= '{sample}.data.txt', # This is what run_inner actually produces
        shell:
            'do stuff {params.real_input}'
    

    If snakemake -s inner.snakefile fails, the dummy output will be deleted, but on the next run snakemake -s inner.snakefile will resume from where it left off, since the files its completed rules already produced are still there.

  • Another option could be to integrate the rules in inner.snakefile into your outer pipeline using, e.g., the include statement. I feel this option is preferable but, of course, it would be more complicated to implement.
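
    A minimal sketch of the include option, assuming inner.snakefile contains a rule that produces '{sample}.data.txt'. The rule name next_step and the output '{sample}.result.txt' are made up for illustration, and 'do stuff' is the same placeholder command as above:

```snakemake
# Outer Snakefile: pull the inner rules into this workflow instead of
# calling `snakemake -s inner.snakefile` from a shell block.
include: "inner.snakefile"

# Hypothetical downstream rule. It depends directly on the file that an
# inner rule produces (assumed here to be '{sample}.data.txt'), so no
# dummy flag file is needed.
rule next_step:
    input:
        '{sample}.data.txt',
    output:
        '{sample}.result.txt',  # hypothetical output name
    shell:
        'do stuff {input} > {output}'
```

    With this layout each inner rule is an ordinary job of the outer workflow, so when one of them fails only that job's own outputs should be removed, and files produced by inner rules that already finished stay in place.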