I am writing a program that reads in two DNA strings (whose length is always divisible by 3) and checks whether they correspond to the same amino acids. For example, AAT and AAC both correspond to N, so my program should print "It's the same". It does this fine for a single codon, but I don't know how to compare strings of 6, 9, 12, or any other multiple of 3 characters and see whether the translations are the same. For example:
AAAAACAAG
AAAAACAAA
should print "It's the same", as both translate to KNK.
This is my code:
sequence = {}
d = 0
for line in open('codon_amino.txt'):
    pattern, character = line.split()
    sequence[pattern] = character
a = input('Enter original DNA: ')
b = input('Enter patient DNA: ')
for i in range(len(a)):
    if sequence[a] == sequence[b]:
        d = d + 0
    else:
        d = d + 1
if d == 0:
    print('It\'s the same')
else:
    print('Mutated!')
My codon_amino.txt is structured like this:
AAA K
AAC N
AAG K
AAT N
ACA T
ACC T
ACG T
ACT T
How do I compare the DNA strings in groups of 3? I have it working for strings that are 3 letters long, but it raises an error for anything longer.
EDIT:
If I knew how to split a and b into lists of three-character pieces, that might help, something like:
a2 = a.split(SPLITINTOINTERVALSOFTHREE)
Then I could easily use a for loop to iterate through them, but how do I split them in the first place?
EDIT: THE SOLUTION:
sequence = {}
d = 0
for line in open('codon_amino.txt'):
    pattern, character = line.split()
    sequence[pattern] = character
a = input('Enter original DNA: ')
b = input('Enter patient DNA: ')
for i in range(len(a)):
    if all(sequence[a[i:i+3]] == sequence[b[i:i+3]] for i in range(0, len(a), 3)):
        d = d + 1
    else:
        d = d + 0
if d == 0:
    print('The patient\'s amino acid sequence is mutated.')
else:
    print('The patient\'s amino acid sequence is not mutated.')
I think you can replace your second loop and comparisons with:
if all(sequence[a[i:i+3]] == sequence[b[i:i+3]] for i in range(0, len(a), 3)):
    print('It\'s the same')
else:
    print('Mutated!')
The all function iterates over the generator expression and returns False as soon as any of the values is False; otherwise it returns True. The generator expression compares the two strings one length-three slice at a time.
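To answer the "how do I split them in the first place?" part directly, here is a minimal sketch of the same slicing idea written as two small helpers. The names split_codons and translate are made up for this illustration, and the table is hard-coded with a few entries from the sample file rather than read from codon_amino.txt.

```python
def split_codons(dna):
    """Split a DNA string into consecutive three-letter codons."""
    if len(dna) % 3 != 0:
        raise ValueError("DNA length must be a multiple of 3")
    # range(0, len(dna), 3) yields 0, 3, 6, ... so each slice is one codon.
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna, table):
    """Return the amino acid string for dna, given a codon -> amino acid dict."""
    return "".join(table[codon] for codon in split_codons(dna))


# A few entries copied from the codon_amino.txt sample in the question;
# in the real program this dict would be built from the file instead.
table = {"AAA": "K", "AAC": "N", "AAG": "K", "AAT": "N"}

original = "AAAAACAAG"
patient = "AAAAACAAA"

print(split_codons(original))      # ['AAA', 'AAC', 'AAG']
print(translate(original, table))  # KNK

same = translate(original, table) == translate(patient, table)
print("It's the same" if same else "Mutated!")
```

Comparing the two translated strings also reports inputs of different lengths as mutated, whereas the all(...) version assumes a and b are the same length. A codon that is missing from the table still raises KeyError in both versions.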