wildcard bioinformatics snakemake

Snakemake: Missing input files for rule all because of wildcard_constraints


I'm creating a pipeline to run RdTest for each chromosome. However, I want to process chromosomes X and Y separately, with different options, so I used Snakemake's wildcard_constraints. But Snakemake gives me the "Missing input files" error. Could someone help me fix it?

rule all:
    input:
        expand(rules.RdTest_autosomes.output,source = ALGO,chrom = CONTIG_LIST),

rule RdTest_autosomes:
    input:
        bed = OUTPUT_DIR + "/GenerateBatchMetrics/All_Beds/SEP/" + BATCH + '.{chrom}.{source}.bed',
        bincov = rules.ZPaste.output.matrix_file,
        medmat = rules.T_CalcMedCov.output,
    output:
        metrics=OUTPUT_DIR + '/GenerateBatchMetrics/Metrics/' + BATCH + '.{source}.{chrom}.metrics',
    wildcard_constraints:
        chrom='(' + '|'.join(AUTOSOMAL) + ')'   # <- from chr1 to chr22
    params:
        prefix = BATCH + '.{source}.{chrom}',
        famfile = config['base']['fam_file'],
        sample = OUTPUT_DIR + "/sample_list.txt",
        op_dir = OUTPUT_DIR + '/GenerateBatchMetrics/Metrics/',
    singularity:
        "sif_images/rdtest.sif"
    threads: workflow.cores * 0.4
    shell:
        """
        Rscript src/RdTest/RdTest.R \\
            -b {input.bed} \\
            -n {params.prefix} \\
            -o {params.op_dir} \\
            -c {input.bincov} \\
            -m {input.medmat} \\
            -f {params.famfile} \\
            -w {params.sample} 
        
        touch {output}
        """

Solution

  • With your wildcard_constraints, the rule can only produce output for the AUTOSOMAL contigs, so no rule can create the X and Y targets requested in rule all, and you get missing file errors for them. You could write a rule RdTest_sexchroms similar to RdTest_autosomes, but constrained to X and Y and with the parameters appropriate for the sex chromosomes. However, I think it would be better to remove the constraints altogether and use parameters that depend on the value of {chrom}. Something like:

    rule RdTest_autosomes:
        input:
            ...
        output:
            ...
        params:
            arg1=lambda wc: 'some-value' if wc.chrom in AUTOSOMAL \
                 else 'another-value' if wc.chrom in ['X', 'Y'] \
                 else None, 
            ... other params
        shell:
            """
            Rscript src/RdTest/RdTest.R \\
                -p {params.arg1} \\
                ...
            """
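
To see why the constraint leads to the error, here is a minimal plain-Python sketch that roughly imitates the check Snakemake does when it tries to match a requested output file against a constrained wildcard. The contig names (`chr1` to `chr22`, `chrX`, `chrY`) and the helper `rule_can_produce` are assumptions made for illustration; they are not taken from your Snakefile, and the real matching is done on the full output path rather than on the wildcard value alone.

```python
import re

# Assumed stand-ins for the variables in the question's Snakefile.
AUTOSOMAL = [f"chr{i}" for i in range(1, 23)]
CONTIG_LIST = AUTOSOMAL + ["chrX", "chrY"]

# Same pattern as the wildcard_constraints entry in the question.
constraint = "(" + "|".join(AUTOSOMAL) + ")"


def rule_can_produce(chrom):
    """Hypothetical helper: True if the constrained rule could match this value."""
    return re.fullmatch(constraint, chrom) is not None


# Targets requested by rule all that the constrained rule cannot create.
unmatched = [c for c in CONTIG_LIST if not rule_can_produce(c)]
print(unmatched)
```

Any contig left in `unmatched` is requested by `expand(...)` in rule all but has no rule able to create it, which is what Snakemake reports as missing input files. Either a second rule covering those contigs or the unconstrained rule with a `params` function would leave that list empty.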