I am new to Python and have been struggling with this for the last week. Could someone help me solve this problem? It would help me finish my project.
I am trying to apply single mutations, and their combinations of two and three, to a given sequence based on user input:
INPUT SEQUENCE: >PEACCEL
USER MUTATION INPUT FILE:
E2R
C4W
E6G
#!/usr/bin/python
import getopt
import sys
import itertools as it
from itertools import groupby

def main(argv):
    seq = conA = None
    try:
        # Short options that take a value end in ':'; long options that take a value end in '='.
        opts,operands = getopt.getopt(argv,'i:m:o:',["INPUT_FILE=","MUTATION_FILE=","OUTPUT_FILE=","help"])
        if len(opts) == 0:
            print "Please use the correct arguments, for usage type --help "
        else:
            for option,value in opts:
                if option == "-i" or option == "--INPUT_FILE":
                    seq=inputFile(value)
                if option == "-m" or option == "--MUTATION_FILE":
                    conA=MutationFile(value)
                if option == "-o" or option == "--OUTPUT_FILE":
                    out=outputName(value)  # outputName() is not shown in this excerpt
    except getopt.GetoptError,err:
        print str(err)
        print "Please use the correct arguments, for usage type --help"
    # Return once every option has been processed; a value stays None if its option was missing.
    return seq,conA

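def inputFile(value):
    try:
        fh = open(value,'r')
    except IOError:
        print "The file %s does not exist \n" % value
    else:
        # groupby() alternates between runs of header lines (starting with ">") and runs of sequence lines.
        ToSeperate= (x[1] for x in groupby(fh, lambda line: line[0] == ">"))
        for header in ToSeperate:
            header = header.next()[1:].strip()
            # The group that follows a header holds the sequence lines of that record.
            Sequence = "".join(s.strip() for s in ToSeperate.next())
        return Sequence
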
def MutationFile(value):
    try:
        fh=open(value,'r')
        content=fh.read()
        Rmcontent=str(content.rstrip())
    except IOError:
        print "The file %s does not exist \n" % value
    else:
        # One list entry per character, newlines included: ['E','2','R','\n','C','4','W',...]
        con=list(Rmcontent)
        return con

def Mutation(SEQUENCES,conA):
    R=len(conA)
    if R>1:
        out=[]
        # conA holds one character per entry, so each mutation spans four entries
        # ("E2R\n"): the position sits at offset 1 and the new residue at offset 2.
        SecondNum=1
        ThirdChar=2
        for index in range(len(conA)):
            MR=conA[index]
            if index==SecondNum:
                SN=MR
                SecondNum=SecondNum+4
            if index==ThirdChar:
                TC=MR
                ThirdChar=ThirdChar+4
                SecNum=int(SN.rstrip())
                MutateResidue=str(TC.rstrip())
                for index in range(len(SEQUENCES)):
                    if index==SecNum-1:
                        NonMutate=SEQUENCES[index]
                        AfterMutate=NonMutate.replace(NonMutate,MutateResidue)
                        new=SEQUENCES[ :index]+AfterMutate+SEQUENCES[index+1: ]
                        MutatedInformation= ['>',NonMutate,index+1,MutateResidue,'\n',new]
                        values2 = ''.join(str(i)for i in MutatedInformation)
                        # Collect each single-mutation record so the caller can use it.
                        out.append(values2)
        return out

if __name__ == "__main__":
    seq,conA=main(sys.argv[1:])
    Mutation(seq,conA)
The code above is part of my program. It replaces E, C, E at positions 2, 4, 6 with R, W, G and stores the mutated sequences, which gives three lines like this (mutated residues shown in lowercase):
PrACCEL
PEAwCEL
PEACCgL
Now I want to build the combinations of two and of three out of these three single mutations, so that each output line carries two mutations (2C) or all three mutations (3C).
The expected output would look like this:
2C
PrAwCEL
PrACCgL
PEAwCgL
3C
PrAwCgL
Algorithm
This is only part of my code, so I will explain my algorithm:
1. I read the mutation file, where each entry has three parts, e.g. E2R: E is the amino acid letter, 2 is its position in the input sequence PEACCEL, and R is the residue that E2 will be changed to.
2. First, I extract the position and the third character from the mutation file and store them in the variables SecNum and MutateResidue.
3. Then I loop over the sequence (PEACCEL) by index, and wherever the index matches SecNum (positions 2, 4, 6) I replace that residue with MutateResidue, the third character in the mutation file (R, W, G).
4. Finally, I join the mutated residue with the rest of the sequence using this line: new=SEQUENCES[:index]+AfterMutate+SEQUENCES[index+1: ]
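To make steps 1 to 4 concrete, here is a minimal sketch of a single mutation applied to the sequence. The names parse_mutation and apply_mutation are hypothetical and not part of my program. The sketch assumes one-letter residues and 1-based positions, and it is written to run under Python 3 (print is called as a function). It also checks that the residue named in the mutation really sits at that position, which my code above does not do.

```python
import re

# Assumed format: one letter, a 1-based position, one letter (e.g. "E2R").
MUTATION_RE = re.compile(r"^([A-Za-z])(\d+)([A-Za-z])$")


def parse_mutation(mutation):
    """Split a mutation such as 'E2R' into ('E', 2, 'R')."""
    match = MUTATION_RE.match(mutation.strip())
    if match is None:
        raise ValueError("Not a valid mutation: %r" % mutation)
    original, position, replacement = match.groups()
    return original, int(position), replacement


def apply_mutation(sequence, mutation):
    """Return a copy of sequence with a single mutation applied."""
    original, position, replacement = parse_mutation(mutation)
    index = position - 1  # mutation positions are 1-based
    if not 0 <= index < len(sequence) or sequence[index] != original:
        raise ValueError("%s does not match the sequence" % mutation)
    # Same slicing idea as in step 4: prefix + new residue + suffix.
    return sequence[:index] + replacement + sequence[index + 1:]


print(apply_mutation("PEACCEL", "E2R"))  # PRACCEL
```

Because the position is taken as a run of digits rather than a single character, an entry such as A12G would be parsed as position 12.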
Thanks in advance
from itertools import combinations,chain
from collections import Counter

def Mutation(SEQUENCES,conA):
    # conA is expected to be a list of mutation strings such as ['E2R', 'C4W', 'E6G'], e.g.
    #mutations=map(lambda x:x.strip(),open('a.txt','r').readlines())
    mutation_combinations= chain.from_iterable([list(combinations(conA,i))for i in range(1,4)])
    #[('E2R',), ('C4W',), ('E6G',), ('E2R', 'C4W'), ('E2R', 'E6G'), ('C4W', 'E6G'), ('E2R', 'C4W', 'E6G')]
    for i in mutation_combinations:
        print "combination :"+'_'.join(i)
        c=Counter({})
        temp_string=SEQUENCES
        for j in i:
            # j[1] is the position and j[2] the new residue; this assumes single-digit positions.
            c[j[1]]=j[2].lower()
        for index,letter in c.items():
            temp_string=temp_string[:int(index)-1]+letter+temp_string[int(index):]
        print temp_string

combination :E2R
PrACCEL
combination :C4W
PEAwCEL
combination :E6G
PEACCgL
combination :E2R_C4W
PrAwCEL
combination :E2R_E6G
PrACCgL
combination :C4W_E6G
PEAwCgL
combination :E2R_C4W_E6G
PrAwCgL
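The code above reads the position from a single character, so it only covers positions 1 to 9. As a hedged variation, the sketch below parses the position with a regular expression so that entries like A12G would also work. The names apply_combination and all_mutants are made up for this example, the pattern assumes one-letter residues, and the code targets Python 3. It does not check that the original residue matches the sequence.

```python
import re
from itertools import combinations


def apply_combination(sequence, combo):
    """Apply every mutation in combo; mutated residues are written in lowercase."""
    residues = list(sequence)
    for mutation in combo:
        # Assumed format: letter, 1-based position (any number of digits), letter.
        match = re.match(r"^[A-Za-z](\d+)([A-Za-z])$", mutation)
        if match is None:
            raise ValueError("Not a valid mutation: %r" % mutation)
        position = int(match.group(1))
        residues[position - 1] = match.group(2).lower()
    return "".join(residues)


def all_mutants(sequence, mutations, max_size=3):
    """Return (name, mutated sequence) pairs for combinations of size 1..max_size."""
    results = []
    for size in range(1, max_size + 1):
        for combo in combinations(mutations, size):
            results.append(("_".join(combo), apply_combination(sequence, combo)))
    return results


for name, mutant in all_mutants("PEACCEL", ["E2R", "C4W", "E6G"]):
    print("combination :" + name)
    print(mutant)
```

For the three example mutations this prints the same seven combinations as the listing above.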
Algorithm I followed:
Read the mutation strings (E2R, ...) from a file using mutations=map(lambda x:x.strip(),open('a.txt','r').readlines())
Build the combinations of the mutations with mutation_combinations= chain.from_iterable([list(combinations(mutations,i))for i in range(1,4)])
If you have 4 mutations and want combinations of up to all four, change the upper range value to 5.
For each combination, I replace the residues at the given positions with the specified characters:
for j in i:
    c[j[1]]=j[2].lower()
I use the Counter above as a mapping from position to replacement character, to keep track of which character to put where while applying a combination of mutations.
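If you would rather not edit the range by hand each time the number of mutations changes, one option is to derive the upper bound from the list itself. The sketch below is only an illustration: read_mutations and combinations_by_size are hypothetical helper names, it targets Python 3, and it groups the combinations by size so they can be labelled 2C and 3C as in the question.

```python
from itertools import combinations


def read_mutations(lines):
    """Return the non-empty, stripped lines, e.g. ['E2R', 'C4W', 'E6G']."""
    return [line.strip() for line in lines if line.strip()]


def combinations_by_size(mutations, min_size=1):
    """Map each combination size to the list of mutation tuples of that size."""
    return {
        size: list(combinations(mutations, size))
        # The upper bound follows the number of mutations, so nothing is hard-coded.
        for size in range(min_size, len(mutations) + 1)
    }


# A list of lines stands in for the mutation file here.
mutations = read_mutations(["E2R\n", "C4W\n", "E6G\n"])
grouped = combinations_by_size(mutations, min_size=2)
for size in sorted(grouped):
    print("%dC" % size)
    for combo in grouped[size]:
        print("_".join(combo))
```

With a real file, read_mutations can be given the open file handle directly, for example inside a with open('a.txt') as fh: block, since iterating over a file yields its lines.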