
Force a certain rule to execute at the end


My question is very similar to this one.

I am writing a snakemake pipeline that does a lot of pre- and post-alignment quality control. At the end of the pipeline, I run multiQC on those QC results.

Basically, the workflow is: preprocessing -> fastqc -> alignment -> post-alignment QCs such as picard, qualimap, and preseq -> peak calling -> motif analysis -> multiQC.

MultiQC should generate a report on all of those outputs, as long as multiQC supports them.

One way to force multiqc to run at the very end is to list all the output files from the above rules in the input directive of the multiqc rule, as below:

rule a:
  input: "a.input"
  output: "a.output"
  
rule b:
  input: "b.input"
  output: "b.output"
  
rule c:
  input: "b.output"
  output: "c.output"
  
rule multiqc:
  input: "a.output", "c.output"
  output: "multiqc.output"

However, I want a more flexible approach that doesn't depend on specific upstream output files. That way, when I change the pipeline (adding or removing rules), I don't need to change the dependencies of the multiqc rule. The input to multiqc should simply be a directory containing all the files that I want multiqc to scan.

In my situation, how can I force the multiQC rule to execute at the very end of the pipeline? Or is there a general way to force a certain rule in snakemake to run as the last job? Perhaps through some snakemake configuration such that, no matter how I change the pipeline, this rule will execute at the end. I am not sure whether such a method exists.

Thanks very much for helping!


Solution

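  • It seems like the onsuccess handler in snakemake is what I am looking for. It is not a rule: its body runs once, after the whole workflow has finished without errors, so it does not need to list any upstream output files.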
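
A minimal sketch of how the onsuccess handler could be used here. The directory names qc/ and multiqc_report/ are hypothetical, and the sketch assumes every QC rule writes its results somewhere under qc/ and that multiqc is on the PATH:

```
# Hypothetical layout: all QC rules write their results under qc/.
rule a:
  input: "a.input"
  output: "qc/a.output"

rule all:
  input: "qc/a.output"

# Not a rule: this block runs once, after every requested job has succeeded.
onsuccess:
  # Let multiqc scan the whole directory; -o sets the report directory.
  shell("multiqc qc/ -o multiqc_report")
```

Because the handler is not a rule, snakemake does not track the report as an output, so the report cannot be requested as a target or used as input to another rule. If that matters, listing the inputs explicitly in a multiqc rule remains the safer option.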