Is it possible to add a conditional statement in snakemake's rule all?


I want to run multiple snakefiles called qc.smk, dada2.smk, and picrust2.smk using singularity. There is also one snakefile called longitudinal.smk that I would like to run conditionally, for example only if longitudinal data is being used.

# set vars

LONGITUDINAL = config['perform_longitudinal']

rule all:
  input:
    # fastqc output before trimming
    raw_html = expand("{scratch}/fastqc/{sample}_{num}_fastqc.html", scratch = SCRATCH, sample=SAMPLE_SET, num=SET_NUMS),
    raw_zip = expand("{scratch}/fastqc/{sample}_{num}_fastqc.zip", scratch = SCRATCH, sample=SAMPLE_SET, num=SET_NUMS),
    raw_multi_html = SCRATCH + "/fastqc/raw_multiqc.html",
    raw_multi_stats = SCRATCH + "/fastqc/raw_multiqc_general_stats.txt"

# there are many more files in rule all

##### setup singularity #####

singularity: "docker://continuumio/miniconda3"

##### load rules #####

include: "rules/qc.smk"
include: "rules/dada2.smk"
include: "rules/phylogeny.smk"
include: "rules/picrust2.smk"

if LONGITUDINAL == 'yes':
    include: 'rules/longitudinal.smk'
    print("Will perform a longitudinal analysis")
else:
    print("no longitudinal analysis")

The code above works only if I am running a longitudinal dataset. When I am not running the longitudinal analysis, snakemake fails and says something like:

MissingInputException in line 70 of /mnt/c/Users/noahs/projects/tagseq-qiime2-snakemake-1/Snakefile:
Missing input files for rule all:

I think that if I could add a conditional statement to rule all, similar to the one I have for my external snakefile, snakemake would not complain when the longitudinal snakefile is not included.


Solution

  • You can define a list (or dict) of the outputs you want outside of rule all and pass it to the input. Something like this works:

    myoutput = []

    # condition_1 and condition_2 are placeholders for your own checks
    if condition_1:
        myoutput.append("file_1.txt")
    if condition_2:
        myoutput.append("file_2.txt")
    
    rule all:
        input:
            myoutput
    

    edit:

    Either place myoutput as first in the input of rule all:

    rule all:
        input:
            myoutput,
            raw_html = "raw_html_path",
            raw_zip = "raw_zip_path"
    

    or make it named, and place it wherever:

    rule all:
        input:
            raw_html = "raw_html_path",
            myoutput = myoutput,
            raw_zip = "raw_zip_path"
    

    In Python (and snakemake), positional arguments always go before named (keyword) arguments.
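
Applied to the Snakefile in the question, the same idea could look roughly like the sketch below. It is only an illustration: the helper name `build_targets` and the longitudinal output path are hypothetical (the question does not show what longitudinal.smk produces), and only the `perform_longitudinal` config key and the multiqc paths are taken from the original Snakefile.

```python
def build_targets(config, scratch):
    """Collect the files that rule all should request (hypothetical helper)."""
    # Targets that are always needed (paths taken from the question)
    targets = [
        scratch + "/fastqc/raw_multiqc.html",
        scratch + "/fastqc/raw_multiqc_general_stats.txt",
    ]
    # Request longitudinal outputs only when the analysis is switched on,
    # using the same 'yes' convention as the conditional include
    if config.get("perform_longitudinal") == "yes":
        # Hypothetical path: replace with the real outputs of longitudinal.smk
        targets.append(scratch + "/longitudinal/result.txt")
    return targets


# In the Snakefile this would then be used as:
#
# rule all:
#   input:
#     build_targets(config, SCRATCH)
```

Because the list is built with the same condition that guards the include, rule all never asks for a file that no loaded rule can produce.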