I guess this has been a recurring issue since Snakemake v5.1. I saw a few threads, but couldn't figure out what's going wrong. Help, please.
rule all:
    input:
        [OUT_DIR + "/" + x for x in expand('{sample}', sample = SAMPLES)]

rule star_mapping:
    input:
        dna = DNA,
        r1 = lambda wildcards: FILES[wildcards.sample]['R1'],
        r2 = lambda wildcards: FILES[wildcards.sample]['R2'],
    output:
        directory(join(OUT_DIR, '{sample}'))
    log:
        'logs/{sample}_star_mapping.log'
    run:
        shell(
            'STAR --runMode alignReads'
            ' --runThreadN 10'
            ' --genomeDir {input.dna}'
            ' --genomeLoad LoadAndKeep'
            ' --limitBAMsortRAM 8000000000'
            ' --readFilesIn {input.r1} {input.r2}'
            ' --outSAMtype BAM SortedByCoordinate'
            ' --quantMode GeneCounts'
            ' --readFilesCommand zcat'
            ' --outFileNamePrefix {output}_'
            ' &> {log}'
        )
Error: MissingOutputException in rule star_mapping.
I don't think you understood the directory() function correctly.
By specifying directory(join(OUT_DIR, '{sample}')), Snakemake expects the directory OUT_DIR/{sample}/ to be created.
However, the rule star_mapping never creates that directory. Because --outFileNamePrefix {output}_ only prepends a string to STAR's file names, the rule creates the following files next to it instead:
OUT_DIR/{sample}_Aligned.sortedByCoord.out.bam
OUT_DIR/{sample}_Log.final.out
OUT_DIR/{sample}_Log.out
etc...
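To make the mismatch concrete, here is a minimal Python sketch of the path arithmetic. The values of OUT_DIR and sample are hypothetical, and the list of suffixes is only the subset of STAR output files named above, not an exhaustive one.

```python
from posixpath import join  # POSIX-style join so the paths look the same on every OS

OUT_DIR = "results"  # hypothetical value, stands in for the Snakefile's OUT_DIR
sample = "sampleA"   # hypothetical wildcard value

# What directory(join(OUT_DIR, '{sample}')) declares as the rule's output.
declared_output = join(OUT_DIR, sample)

# What '--outFileNamePrefix {output}_' expands to: a plain string prefix.
prefix = declared_output + "_"

# Subset of the files STAR writes, taken from the list above.
star_suffixes = ["Aligned.sortedByCoord.out.bam", "Log.final.out", "Log.out"]
star_files = [prefix + suffix for suffix in star_suffixes]

print("Snakemake waits for directory:", declared_output + "/")
for path in star_files:
    print("STAR writes:", path)

# None of the files lands inside the declared directory.
inside = [p for p in star_files if p.startswith(declared_output + "/")]
print("files inside the declared directory:", inside)
```

Since nothing is ever written under the declared directory, Snakemake cannot find the output once the job finishes, which is consistent with the MissingOutputException.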
According to
https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#directories-as-outputs
directory() should be used for timestamp lookup issues. The doc states:
Always consider if you can’t formulate your workflow using normal files before resorting to using directory().
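So I guess you should consider defining a file such as
"OUT_DIR/{sample}_Aligned.sortedByCoord.out.bam"
as the output of your star_mapping rule, and using
--outFileNamePrefix {OUT_DIR}/{wildcards.sample}_
for the STAR option. Note that rule all then has to request these BAM files rather than the per-sample directories.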
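Put together, the suggestion might look like the sketch below. It is untested and assumes that OUT_DIR, SAMPLES, DNA, FILES and join are defined earlier in the Snakefile as in the question. The output name bam is introduced here only for illustration. The STAR options are copied unchanged from the question.

```snakemake
rule all:
    input:
        # Request the BAM files that STAR actually writes, not directories.
        expand(join(OUT_DIR, '{sample}_Aligned.sortedByCoord.out.bam'), sample=SAMPLES)

rule star_mapping:
    input:
        dna = DNA,
        r1 = lambda wildcards: FILES[wildcards.sample]['R1'],
        r2 = lambda wildcards: FILES[wildcards.sample]['R2'],
    output:
        # A regular file as output, so directory() is not needed.
        bam = join(OUT_DIR, '{sample}_Aligned.sortedByCoord.out.bam')
    log:
        'logs/{sample}_star_mapping.log'
    shell:
        'STAR --runMode alignReads'
        ' --runThreadN 10'
        ' --genomeDir {input.dna}'
        ' --genomeLoad LoadAndKeep'
        ' --limitBAMsortRAM 8000000000'
        ' --readFilesIn {input.r1} {input.r2}'
        ' --outSAMtype BAM SortedByCoordinate'
        ' --quantMode GeneCounts'
        ' --readFilesCommand zcat'
        ' --outFileNamePrefix {OUT_DIR}/{wildcards.sample}_'
        ' &> {log}'
```

The other files STAR produces (Log.final.out, Log.out, and so on) are still written with the same prefix. They only need to be listed under output if a downstream rule consumes them.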