I am trying to write code that compares a patient's variant file against gene panels. The gene panel file is in CSV format and contains the chromosome, gene, start location, and end location. The patient file contains the chromosome, the mutations, and their locations. I wrote a loop that passes each gene panel entry to a function, which does the comparison and returns a list of matching items. The function works fine when I call it with manual data, but it does not do the comparison when called inside the loop.
import vcf
import os, sys

records = open('exampleGenePanel.csv')
read = vcf.Reader(open('examplePatientFile.vcf','r'))

#functions to find mutations in patients sequence
def findMutations(gn,chromo,start,end):
    start = int(start)
    end = int(end)
    for each in read:
        CHROM = each.CHROM
        if CHROM != chromo:
            continue
        POS = each.POS
        if POS < start:
            continue
        if POS > end:
            continue
        REF = each.REF
        ALT = each.ALT
        print (gn,CHROM,POS,REF,ALT)
        list.append([gn,CHROM,POS,REF,ALT])
    return list

gene = records.readlines()
list=[]
y = len (gene)
x=1
while x < 3:
    field = gene[x].split(',')
    gname = field[0]
    chromo = field[1]
    gstart = field[2]
    gend = field[3]
    findMutations(gname,chromo,gstart,gend)
    x = x+1

if not list:
    print ('Mutation not found')
else:
    print (len(list),' Mutations found')
    print (list)
I want to get the details of the matching mutations in the list. This works as expected when I pass the data to the function manually, e.g. findMutations('TESTGene','chr8','146171437','146229161'), but it does not find any matches when the data is passed through the loop.
The problem is that findMutations attempts to read from read each time it is called, but after the first call, read has already been consumed and there's nothing left. I suggest reading the contents of read once, before calling the function, and saving the results in a list. Then findMutations can iterate over that list each time it is called.
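The exhaustion can be reproduced with any one-pass iterator. Here is a minimal sketch that uses a plain generator as a stand-in for the vcf.Reader object; fake_reader and its two records are made up purely for illustration:

```python
def fake_reader():
    # Hypothetical stand-in for vcf.Reader: yields each record exactly once,
    # the same way a file is consumed as it is read.
    yield ("chr8", 146171500)
    yield ("chr8", 146200000)

read = fake_reader()

first_pass = [rec for rec in read]   # consumes every record
second_pass = [rec for rec in read]  # the iterator is already exhausted

print(len(first_pass), len(second_pass))  # prints: 2 0
```

The second loop does not raise an error; it simply runs zero times, which is why the original script silently reports no mutations.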
It would also be a good idea to use a name other than list for your result list, since that name shadows the Python built-in list type. It would also be better to have findMutations return its result list rather than append to a global.
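Putting these suggestions together, one possible restructuring looks like the sketch below. It is not a drop-in replacement. The names find_mutations, scan_panel, and Variant are hypothetical. It assumes the panel columns are in the order the question's code reads them (gene, chromosome, start, end) and that the first row is a header. It also scans every panel row rather than only the first two.

```python
import csv
from collections import namedtuple

def find_mutations(variants, gene, chrom, start, end):
    """Return the variants on chrom whose position lies in [start, end]."""
    start, end = int(start), int(end)
    matches = []  # local result list, returned instead of appended to a global
    for v in variants:
        if v.CHROM == chrom and start <= v.POS <= end:
            matches.append([gene, v.CHROM, v.POS, v.REF, v.ALT])
    return matches

def scan_panel(panel_path, vcf_path):
    """Read the VCF once, then check every gene panel row against it."""
    import vcf  # PyVCF; imported here so find_mutations can be tried without it
    with open(vcf_path) as handle:
        variants = list(vcf.Reader(handle))  # read the whole file once
    found = []
    with open(panel_path, newline="") as handle:
        rows = csv.reader(handle)
        next(rows, None)  # assumption: the first row is a header
        for row in rows:
            if len(row) < 4:
                continue  # skip blank or incomplete rows
            found.extend(find_mutations(variants, row[0], row[1], row[2], row[3]))
    return found

# Quick check with made-up records standing in for PyVCF records.
Variant = namedtuple("Variant", "CHROM POS REF ALT")
variants = [
    Variant("chr8", 146171500, "A", ["G"]),
    Variant("chr1", 100, "C", ["T"]),
]
first = find_mutations(variants, "TESTGene", "chr8", "146171437", "146229161")
second = find_mutations(variants, "TESTGene", "chr8", "146171437", "146229161")
print(first == second, len(first))
```

Because variants is an ordinary list, the second call sees the same records as the first. With the real files, the call would be scan_panel('exampleGenePanel.csv', 'examplePatientFile.vcf').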