I want to write a Snakemake pipeline that processes short-read sequencing files, long-read sequencing files, or both, depending on which file types are provided as input. First, my Snakefile calls a shell script that creates a config file listing the names of all short-read files in the input directory under the heading short_reads and all long-read files under the heading long_reads. This is followed by my rule all:
rule all:
    input:
        expand("../qc/id/{sample}/fastqc_raw/{sample}_R1_fastqc.html", sample=config["samples_short"]),
        expand("../qc/id/{sample}/nanoplot_raw/NanoPlot-report.html", sample=config["samples_long"])
...
However, if one of the file types (long or short reads) is not provided, Snakemake fails with a KeyError. If I modify the config file so that the heading is still present but has no sample names under it, Snakemake evaluates the input with the value None, e.g.
Missing input files for rule nanoplot_raw: ../raw_reads/None_ont.fastq.gz
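Both failure modes can be reproduced in plain Python, without Snakemake. This is only a sketch: the dict stands in for the parsed config file, and the path pattern is inferred from the error message above rather than copied from the actual rule.

```python
# Stand-in for the parsed config: the long-read heading is present but empty
# (an empty YAML key is loaded as None), and the short-read heading is absent.
config = {"samples_long": None}

# Case 1: the heading is missing entirely, so the lookup raises KeyError.
try:
    config["samples_short"]
    missing_key_error = None
except KeyError as err:
    missing_key_error = err

# Case 2: the heading exists but holds None, which is formatted as the text "None".
# The pattern below is an assumption based on the error message.
path = "../raw_reads/{sample}_ont.fastq.gz".format(sample=config["samples_long"])
print(path)  # ../raw_reads/None_ont.fastq.gz
```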
How can I design rule all so that it can handle only short reads, only long reads, or both sequence types as input?
Thanks for your help!
Does the following work?
# config.get() returns None when the key is missing, so this covers both
# the KeyError case and the case of a heading without sample names.
if config.get("samples_short"):
    fastqc_short = expand("../qc/id/{sample}/fastqc_raw/{sample}_R1_fastqc.html", sample=config["samples_short"])
else:
    fastqc_short = []

if config.get("samples_long"):
    nanoplot_long = expand("../qc/id/{sample}/nanoplot_raw/NanoPlot-report.html", sample=config["samples_long"])
else:
    nanoplot_long = []

rule all:
    input:
        fastqc_short,
        nanoplot_long,
...
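If more read types are added later, the same check could be factored into a small helper. The sketch below is plain Python so it can be tried outside a Snakefile; `sample_list` and `targets` are hypothetical names, and `targets` is only a simplified stand-in for Snakemake's `expand`, not its actual implementation.

```python
def sample_list(config, key):
    """Return the samples under `key` as a list; [] if the key is missing or empty."""
    value = config.get(key)  # None when the heading is absent
    if not value:            # covers None, "" and []
        return []
    if isinstance(value, str):
        return [value]       # a single sample name given as a plain string
    return list(value)


def targets(pattern, samples):
    """Simplified stand-in for expand(): fill {sample} once per sample."""
    return [pattern.format(sample=s) for s in samples]


# Example config: two short-read samples, long-read heading present but empty.
config = {"samples_short": ["A", "B"], "samples_long": None}

all_targets = (
    targets("../qc/id/{sample}/fastqc_raw/{sample}_R1_fastqc.html",
            sample_list(config, "samples_short"))
    + targets("../qc/id/{sample}/nanoplot_raw/NanoPlot-report.html",
              sample_list(config, "samples_long"))
)
print(all_targets)
```

Inside a Snakefile, `sample_list(config, "samples_long")` could be passed directly as the `sample=` argument of `expand`; an empty list then produces no targets for that read type.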