cluster-computing wildcard snakemake

Snakemake on cluster error: 'Wildcards' object has no attribute 'output'


When I submit my Snakemake workflow to our cluster, I get the error 'Wildcards' object has no attribute 'output', the same message as in an earlier question with that title. Do you have any suggestions for making this rule work on the cluster?

While my rule annotate_snps works when I test it locally, I get the following error on the cluster:

    input: results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk.vcf.gz
    output: results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk_rename.vcf.gz, results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk_tmp.vcf.gz, results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk_ann.vcf.gz
    log: results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_annotate_snps.log
    jobid: 1139
    wildcards: samp=CI226380_S4, mapper=bwa, ref=H37Rv

WorkflowError in line 173 of /oak/stanford/scg/lab_jandr/walter/tb/mtb/workflow/Snakefile:
'Wildcards' object has no attribute 'output'

My rule is defined as:

rule annotate_snps: 
  input:
    vcf='results/{samp}/vars/{samp}_{mapper}_{ref}_gatk.vcf.gz'
  log: 
    'results/{samp}/vars/{samp}_{mapper}_{ref}_annotate_snps.log'
  output:
    rename_vcf=temp('results/{samp}/vars/{samp}_{mapper}_{ref}_gatk_rename.vcf.gz'),
    tmp_vcf=temp('results/{samp}/vars/{samp}_{mapper}_{ref}_gatk_tmp.vcf.gz'),
    ann_vcf='results/{samp}/vars/{samp}_{mapper}_{ref}_gatk_ann.vcf.gz'
  params: 
    bed=config['bed_path'],
    vcf_header=config['vcf_header']
  shell:
    '''
    # Rename Chromosome to be consistent with snpEff/Ensembl genomes.
    zcat {input.vcf}| sed 's/NC_000962.3/Chromosome/g' | bgzip > {output.rename_vcf}
    tabix {output.rename_vcf}

    # Run snpEff
    java -jar -Xmx8g {config[snpeff]} eff {config[snpeff_db]} {output.rename_vcf} -dataDir {config[snpeff_datapath]} -noStats -no-downstream -no-upstream -canon > {output.tmp_vcf}

    # Also use bed file to annotate vcf
    bcftools annotate -a {params.bed} -h {params.vcf_header} -c CHROM,FROM,TO,FORMAT/PPE {output.tmp_vcf} > {output.ann_vcf}

    '''

Thank you so much in advance!


Solution

  • The rule definition itself looks consistent. The only unusual part is that the shell command reads several values directly from config, e.g. config[snpeff].

    One thing to check is whether the config is the same on your local machine and on the cluster. If it is not, one of the values might contain text that confuses Snakemake, e.g. if somehow config[snpeff] == "wildcards.output" (or something similar).
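    The check suggested above could be sketched in plain Python as follows. This is only a minimal illustration, not a confirmed diagnosis. The helper names (diff_configs, suspicious_values) and the example values are hypothetical, and in practice the two dictionaries would be loaded from the config file used in each environment rather than written inline.

```python
import re

# Flags brace placeholders such as {wildcards.output}, or a bare "wildcards." reference.
PLACEHOLDER = re.compile(r"\{[^{}]+\}|wildcards\.")


def diff_configs(local, cluster):
    """Return {key: (local_value, cluster_value)} for keys whose values differ.

    A key that is missing on one side is reported with None for that side.
    """
    keys = sorted(set(local) | set(cluster))
    return {
        k: (local.get(k), cluster.get(k))
        for k in keys
        if local.get(k) != cluster.get(k)
    }


def suspicious_values(config):
    """Return {key: value} for string values that look like Snakemake placeholders."""
    return {
        k: v
        for k, v in config.items()
        if isinstance(v, str) and PLACEHOLDER.search(v)
    }


# Hypothetical example values, NOT taken from the real workflow.
local_cfg = {"snpeff": "/opt/snpEff/snpEff.jar", "snpeff_db": "db_name"}
cluster_cfg = {"snpeff": "{wildcards.output}/snpEff.jar", "snpeff_db": "db_name"}

print(diff_configs(local_cfg, cluster_cfg))
print(suspicious_values(cluster_cfg))
```

    If both calls report nothing for your real configs, the config contents are probably not the source of the difference between the local and cluster runs.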