I am trying to extract positions and SNPs from a VCF file. I have written the following so far. But how can I change the name of the dictionary so that I end up with one dictionary for each input file?
For example, the script is run as: python vcf_compare.py file1.vcf file2.vcf file3.vcf
import sys
import vcf
for variants in sys.argv[1:]:
    file1 = {}
    vcf_reader = vcf.Reader(open(variants))
    for record in vcf_reader:
        pos = record.POS
        alt = record.ALT
        ref= record.REF
        snps[pos]=ref,alt
So for argv[1], a dictionary called file1 is created. How can I make the dictionary's name change to, e.g., file2 for the second iteration of the loop?
Short answer: you can't. This is an incredibly frustrating fact for many early programmers. The fix: another dictionary! Outside of your variants for loop, create another dictionary and use the filename as a key. Example (you can't just copy and paste this, because I don't know how to use the vcf library):
import sys
import vcf
all_files = {}
for variants in sys.argv[1:]:
    # didn't see file1 used, and didn't see snps created
    # so figured file1 was snps...
    snps = {}
    vcf_reader = vcf.Reader(open(variants))
    for record in vcf_reader:
        pos = record.POS
        alt = record.ALT
        ref = record.REF
        snps[pos] = ref, alt
    all_files[variants] = snps
I'm assuming here that variants is a filename in the form of a string. If not, replace the variants in all_files[variants] with the string you want to use as its key.
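Once all_files is filled in, each file's dictionary is reached through its filename rather than through a separate variable name. Here is a rough sketch of what that might look like. The data is made up for illustration (the ALT values are plain strings here, which may not match what the vcf library actually returns), and shared_positions is a hypothetical helper, not part of any library:

```python
# Hypothetical stand-in for what the loop above would build:
# {filename: {position: (ref, alt)}}
all_files = {
    "file1.vcf": {101: ("A", ["G"]), 250: ("C", ["T"])},
    "file2.vcf": {101: ("A", ["G"]), 300: ("G", ["A"])},
}


def shared_positions(files):
    """Return the sorted positions that occur in every per-file dictionary."""
    position_sets = [set(snps) for snps in files.values()]
    if not position_sets:
        return []
    return sorted(set.intersection(*position_sets))


# Look up one file's SNPs by its filename instead of by a variable name.
print(all_files["file1.vcf"][101])

# Positions that are present in all of the input files.
print(shared_positions(all_files))
```

Because the outer dictionary grows with however many filenames are passed on the command line, the same code works for two files or twenty without any new variable names.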