
Python function doesn't work inside a loop


I am trying to write code that compares a patient's gene file against a gene panel. The gene panel file is in CSV format and contains the chromosome, gene, start location, and end location. The patient file contains the chromosome, the mutations, and their locations. I wrote a loop that passes the gene panel information to a function, which does the comparison and returns a list of matching items. The function works when I call it with manually entered data, but it does not do the comparison inside the loop.

import vcf
import os, sys

records = open('exampleGenePanel.csv')
read = vcf.Reader(open('examplePatientFile.vcf','r'))

#functions to find mutations in patients sequence
def findMutations(gn,chromo,start,end):
    start = int(start)
    end = int(end)
    for each in read:
        CHROM = each.CHROM
        if CHROM != chromo:
            continue
        POS = each.POS
        if POS < start:
            continue
        if POS > end:
            continue
        REF = each.REF
        ALT = each.ALT
        print (gn,CHROM,POS,REF,ALT)
        list.append([gn,CHROM,POS,REF,ALT])
    return list

gene = records.readlines()

list=[]
y = len (gene)
x=1
while x < 3:
    field = gene[x].split(',')
    gname = field[0]
    chromo = field[1]
    gstart = field[2]
    gend = field[3]
    findMutations(gname,chromo,gstart,gend)
    x = x+1
if not list:
    print ('Mutation not found')
else:
    print (len(list),' Mutations found')
    print (list)

I want to get the details of the matching mutations in the list. This works as expected when I pass the data to the function manually, e.g. findMutations('TESTGene','chr8','146171437','146229161'), but it doesn't compare anything when the data is passed through the loop.


Solution

  • The problem is that findMutations iterates over read each time it is called, but read can only be consumed once: after the first call it is exhausted and there is nothing left to iterate over. I suggest reading the contents of read once, before calling the function, and saving the records in a list. Then findMutations can iterate over that list each time it is called.

    It would also be a good idea to use a name other than list for your result list, since that name shadows the Python built-in list type (after list=[] you can no longer call list(...) to build a list). It would also be better to have findMutations build and return its own result list rather than append to a global.
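
The suggestions above can be sketched roughly as follows. To keep the example runnable without the vcf package or the input files, fake_reader and Record are hypothetical stand-ins for vcf.Reader and its records, and the panel rows are made-up sample data; find_mutations, patient_records and matches are illustrative names, not part of the original code. In the real script, the equivalent of the "read once" step would presumably be something like patient_records = list(read), done before list is reassigned.

```python
from collections import namedtuple

# Hypothetical stand-in for a VCF record; real records come from vcf.Reader.
Record = namedtuple("Record", ["CHROM", "POS", "REF", "ALT"])


def fake_reader():
    """Yield records one at a time; like a file-backed reader, it is single-pass."""
    yield Record("chr8", 146171500, "A", ["G"])
    yield Record("chr8", 146300000, "C", ["T"])
    yield Record("chr9", 500, "G", ["A"])


# The original symptom: a second pass over the same reader yields nothing.
reader = fake_reader()
first_pass = [rec.POS for rec in reader]
second_pass = [rec.POS for rec in reader]  # empty, the reader is exhausted


def find_mutations(records, gene_name, chromo, start, end):
    """Return a new list of records that fall inside one gene's interval."""
    start, end = int(start), int(end)
    return [
        [gene_name, rec.CHROM, rec.POS, rec.REF, rec.ALT]
        for rec in records
        if rec.CHROM == chromo and start <= rec.POS <= end
    ]


# The fix: consume the single-pass source once and keep the records in memory.
patient_records = list(fake_reader())

# Made-up gene panel rows in the order the loop unpacks them:
# gene name, chromosome, start, end (strings, as read from a CSV file).
panel = [
    ("GeneA", "chr9", "1", "1000"),
    ("TESTGene", "chr8", "146171437", "146229161"),
]

matches = []
for gene_name, chromo, start, end in panel:
    matches.extend(find_mutations(patient_records, gene_name, chromo, start, end))

if not matches:
    print("Mutation not found")
else:
    print(len(matches), "mutations found")
```

Because patient_records is an ordinary list, every call to find_mutations sees all the records, so the second gene in the panel is compared just like the first. Holding the whole file in memory is a trade-off; for very large VCF files, reopening the file for each gene would be an alternative.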