I'm running into the error 'Wildcards' object has no attribute 'output' when I submit Snakemake jobs to my cluster, similar to this earlier question: 'Wildcards' object has no attribute 'output'. Do you have any suggestions for how to make my rule compatible with the cluster?
While my rule annotate_snps works when I test it locally, I get the following error on the cluster:
input: results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk.vcf.gz
output: results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk_rename.vcf.gz, results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk_tmp.vcf.gz, results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk_ann.vcf.gz
log: results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_annotate_snps.log
jobid: 1139
wildcards: samp=CI226380_S4, mapper=bwa, ref=H37Rv
WorkflowError in line 173 of /oak/stanford/scg/lab_jandr/walter/tb/mtb/workflow/Snakefile:
'Wildcards' object has no attribute 'output'
My rule is defined as:
rule annotate_snps:
    input:
        vcf='results/{samp}/vars/{samp}_{mapper}_{ref}_gatk.vcf.gz'
    log:
        'results/{samp}/vars/{samp}_{mapper}_{ref}_annotate_snps.log'
    output:
        rename_vcf=temp('results/{samp}/vars/{samp}_{mapper}_{ref}_gatk_rename.vcf.gz'),
        tmp_vcf=temp('results/{samp}/vars/{samp}_{mapper}_{ref}_gatk_tmp.vcf.gz'),
        ann_vcf='results/{samp}/vars/{samp}_{mapper}_{ref}_gatk_ann.vcf.gz'
    params:
        bed=config['bed_path'],
        vcf_header=config['vcf_header']
    shell:
        '''
        # Rename Chromosome to be consistent with snpEff/Ensembl genomes.
        zcat {input.vcf}| sed 's/NC_000962.3/Chromosome/g' | bgzip > {output.rename_vcf}
        tabix {output.rename_vcf}
        # Run snpEff
        java -jar -Xmx8g {config[snpeff]} eff {config[snpeff_db]} {output.rename_vcf} -dataDir {config[snpeff_datapath]} -noStats -no-downstream -no-upstream -canon > {output.tmp_vcf}
        # Also use bed file to annotate vcf
        bcftools annotate -a {params.bed} -h {params.vcf_header} -c CHROM,FROM,TO,FORMAT/PPE {output.tmp_vcf} > {output.ann_vcf}
        '''
Thank you so much in advance!
The raw rule definition appears to be consistent except for the multiple calls to the contents of config, e.g. config[snpeff].
One thing to check is whether the config definition is the same on the single machine and on the cluster. If it is not, some of its content might be confusing Snakemake, e.g. if somehow config[snpeff] == "wildcards.output" (or something similar).
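One way to run that check is to scan the loaded config for string values that mention wildcards. This is only a minimal sketch: the helper name find_suspicious_values and the example values are made up for illustration and are not part of Snakemake or of your workflow. It assumes config behaves like a plain nested dict.

```python
def find_suspicious_values(config, needle="wildcards"):
    """Return (key_path, value) pairs for string values that contain `needle`."""
    hits = []

    def walk(node, path):
        # Recurse through nested dicts and lists, tracking the key path.
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, path + [str(key)])
        elif isinstance(node, (list, tuple)):
            for index, value in enumerate(node):
                walk(value, path + [str(index)])
        elif isinstance(node, str) and needle in node:
            hits.append((".".join(path), node))

    walk(config, [])
    return hits


# Hypothetical config values, for illustration only.
example_config = {
    "snpeff": "/path/to/snpEff.jar",
    "snpeff_db": "{wildcards.output}",
}

for key, value in find_suspicious_values(example_config):
    print(f"suspicious config entry {key!r}: {value!r}")
```

Running the same check locally and on the cluster (or simply printing config in both places) should show whether the two environments load different values.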