WildcardError - No values given for wildcard - Snakemake


I am really lost as to how to fix this error. I am running Snakemake to perform some post-alignment quality checks. My code looks like this:

SAMPLES = ["Exome_Tumor_sorted_mrkdup_bqsr", "Exome_Norm_sorted_mrkdup_bqsr",
           "WGS_Tumor_merged_sorted_mrkdup_bqsr", "WGS_Norm_merged_sorted_mrkdup_bqsr"]

rule all:
    input:
        expand("post-alignment-qc/flagstat/{sample}.txt", sample=SAMPLES),
        expand("post-alignment-qc/CollectInsertSizeMetics/{sample}.txt", sample=SAMPLES),
        expand("post-alignment-qc/CollectAlignmentSummaryMetrics/{sample}.txt", sample=SAMPLES),
        expand("post-alignment-qc/CollectGcBiasMetrics/{sample}_summary.txt", samples=SAMPLES) # this is the problem causing line

rule flagstat:
    input:
         bam = "align/{sample}.bam"
    output:
          "post-alignment-qc/flagstat/{sample}.txt"
    log:
       err='post-alignment-qc/logs/flagstat/{sample}_stderr.err'
    shell:
         "samtools flagstat {input} > {output} 2> {log.err}"


rule CollectInsertSizeMetics:
    input:
         bam = "align/{sample}.bam"
    output:
          txt="post-alignment-qc/CollectInsertSizeMetics/{sample}.txt",
          pdf="post-alignment-qc/CollectInsertSizeMetics/{sample}.pdf"
    log:
       err='post-alignment-qc/logs/CollectInsertSizeMetics/{sample}_stderr.err',
       out='post-alignment-qc/logs/CollectInsertSizeMetics/{sample}_stdout.txt'

    shell:
         "gatk CollectInsertSizeMetrics -I {input} -O {output.txt} -H {output.pdf} 2> {log.err}"

rule CollectAlignmentSummaryMetrics:
    input:
         bam = "align/{sample}.bam",
         genome= "references/genome/ref_genome.fa"
    output:
          txt="post-alignment-qc/CollectAlignmentSummaryMetrics/{sample}.txt",
    log:
       err='post-alignment-qc/logs/CollectAlignmentSummaryMetrics/{sample}_stderr.err',
       out='post-alignment-qc/logs/CollectAlignmentSummaryMetrics/{sample}_stdout.txt'

    shell:
         "gatk CollectAlignmentSummaryMetrics -I {input.bam} -O {output.txt} -R {input.genome} 2> {log.err}"

rule CollectGcBiasMetrics:
    input:
         bam = "align/{sample}.bam",
         genome= "references/genome/ref_genome.fa"

    output:
          txt="post-alignment-qc/CollectGcBiasMetrics/{sample}_metrics.txt",
          CHART="post-alignment-qc/CollectGcBiasMetrics/{sample}_metrics.pdf",
          S="post-alignment-qc/CollectGcBiasMetrics/{sample}_summary.txt"

    log:
       err='post-alignment-qc/logs/CollectGcBiasMetrics/{sample}_stderr.err',
       out='post-alignment-qc/logs/CollectGcBiasMetrics/{sample}_stdout.txt'

    shell:
         "gatk CollectGcBiasMetrics -I {input.bam} -O {output.txt} -R {input.genome} -CHART {output.CHART} "
         "-S {output.S} 2> {log.err}"

The error message says the following:

WildcardError in line 9 of Snakefile:
No values given for wildcard 'sample'.
  File "Snakefile", line 9, in <module>

In the code above I have marked the line that causes the problem. When I remove this line, everything runs perfectly. I am really confused, because I pretty much copied and pasted each rule, and this is the only one that causes any problems.

If someone could point out what I did wrong, I would be very thankful!

Cheers!


Solution

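  • This looks like a spelling mistake: in the highlighted line you pass samples=SAMPLES to expand(), but the wildcard in the pattern is called {sample}, without the "s". expand() fills each wildcard from the keyword argument with the same name, so {sample} never receives a value. Renaming the keyword to sample=SAMPLES should resolve the error.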
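
To make the name matching concrete, here is a minimal plain-Python sketch. The function expand_sketch is a hypothetical stand-in written for this illustration, not Snakemake's actual expand() implementation; it only assumes that each {name} in the pattern is filled from the keyword argument of the same name.

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Hypothetical stand-in for Snakemake's expand(), for illustration only."""
    names = list(wildcards)
    # Build every combination of the supplied values, one value per keyword.
    combos = product(*(wildcards[name] for name in names))
    # str.format raises KeyError if the pattern names a field that was not passed.
    return [pattern.format(**dict(zip(names, combo))) for combo in combos]


SAMPLES = ["Exome_Tumor_sorted_mrkdup_bqsr", "Exome_Norm_sorted_mrkdup_bqsr"]
PATTERN = "post-alignment-qc/CollectGcBiasMetrics/{sample}_summary.txt"

# Keyword name matches the wildcard name: one path per sample.
fixed = expand_sketch(PATTERN, sample=SAMPLES)

# Keyword "samples" does not match "{sample}", so the field is never filled.
try:
    expand_sketch(PATTERN, samples=SAMPLES)
    mismatch_error = None
except KeyError as exc:
    mismatch_error = exc.args[0]  # name of the field that got no value
```

In this sketch the mismatch surfaces as a KeyError naming the unfilled field; Snakemake reports the same situation as the WildcardError shown in the question.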