
How to fix 'MissingOutputException'?


I guess this is a recurring issue after snakemake v5.1. I saw a few threads, but couldn't figure out what's going wrong. Help, please.

rule all:
    input:
        [OUT_DIR + "/" + x for x in expand('{sample}', sample = SAMPLES)]

rule star_mapping:
    input:
        dna = DNA,
        r1 = lambda wildcards: FILES[wildcards.sample]['R1'],
        r2 = lambda wildcards: FILES[wildcards.sample]['R2'],
    output:
        directory(join(OUT_DIR, '{sample}'))
    log:
        'logs/{sample}_star_mapping.log'
    run:
        shell(
            'STAR --runMode alignReads'
            ' --runThreadN 10'
            ' --genomeDir {input.dna}'
            ' --genomeLoad LoadAndKeep'
            ' --limitBAMsortRAM 8000000000'
            ' --readFilesIn {input.r1} {input.r2}'
            ' --outSAMtype BAM SortedByCoordinate'
            ' --quantMode GeneCounts'
            ' --readFilesCommand zcat'
            ' --outFileNamePrefix {output}_'
            ' &> {log}'
        )

Error: MissingOutputException in rule star_mapping.


Solution

  • I think you have misunderstood what directory() does.
    By specifying directory(join(OUT_DIR, '{sample}')), snakemake expects the rule to create the directory OUT_DIR/{sample}/.
    However, because of --outFileNamePrefix {output}_, the rule star_mapping creates the following files instead:

    OUT_DIR/{sample}_Aligned.sortedByCoord.out.bam
    OUT_DIR/{sample}_Log.final.out
    OUT_DIR/{sample}_Log.out
    etc...
    

    According to
    https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#directories-as-outputs
    directory() should be used for timestamp lookup issues. The doc states:

    Always consider if you can’t formulate your workflow using normal files before resorting to using directory().

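    So I suggest declaring a normal file such as
    "OUT_DIR/{sample}_Aligned.sortedByCoord.out.bam" as the output of your star_mapping rule, and passing
    --outFileNamePrefix {OUT_DIR}/{wildcards.sample}_ to STAR.
    The input of rule all then has to request that BAM file rather than the OUT_DIR/{sample} directory.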
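
For reference, here is a minimal sketch of what the file-based version could look like. It assumes OUT_DIR, DNA, SAMPLES and FILES are defined earlier in the Snakefile as in the question. The params.prefix entry and the switch from run: to shell: are my own choices for readability, not something the question requires, and I have not run this against your data.

```snakemake
from os.path import join

# Assumption: OUT_DIR, DNA, SAMPLES and FILES are defined above, as in the question.

rule all:
    input:
        # Request the BAM files that STAR actually writes, not a directory.
        expand(join(OUT_DIR, '{sample}_Aligned.sortedByCoord.out.bam'), sample=SAMPLES)

rule star_mapping:
    input:
        dna = DNA,
        r1 = lambda wildcards: FILES[wildcards.sample]['R1'],
        r2 = lambda wildcards: FILES[wildcards.sample]['R2'],
    output:
        # A normal file output; its name must match prefix + STAR's fixed suffix.
        bam = join(OUT_DIR, '{sample}_Aligned.sortedByCoord.out.bam')
    params:
        # Snakemake fills in {sample} from the wildcards before the command runs.
        prefix = join(OUT_DIR, '{sample}_')
    log:
        'logs/{sample}_star_mapping.log'
    shell:
        'STAR --runMode alignReads'
        ' --runThreadN 10'
        ' --genomeDir {input.dna}'
        ' --genomeLoad LoadAndKeep'
        ' --limitBAMsortRAM 8000000000'
        ' --readFilesIn {input.r1} {input.r2}'
        ' --outSAMtype BAM SortedByCoordinate'
        ' --quantMode GeneCounts'
        ' --readFilesCommand zcat'
        ' --outFileNamePrefix {params.prefix}'
        ' &> {log}'
```

Since the declared output is now exactly the file STAR writes, snakemake finds it when the job finishes and the MissingOutputException should go away. If downstream rules need the other STAR files (for example the Log.final.out), they can be added as further named outputs in the same way.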