Tags: python, dictionary, python-3.x, dna-sequence

Comparing strings to a dictionary in groups of multiples of 3


I am writing a program that reads in two DNA strings (whose length is always divisible by 3) and checks whether they correspond to the same amino acids. For example, AAT and AAC both correspond to N, so my program should print "It's the same". It does this fine for three letters, but I don't know how to compare strings of 6, 9, 12, or any other multiple of 3 characters and see whether the translations are the same. For example:

AAAAACAAG
AAAAACAAA 

Should return "It's the same", as they are both KNK.

This is my code:

sequence = {}
d = 0
for line in open('codon_amino.txt'):
  pattern, character = line.split()
  sequence[pattern] = character
a = input('Enter original DNA: ')
b = input('Enter patient DNA: ')
for i in range(len(a)):
  if sequence[a] == sequence[b]:
    d = d + 0
  else:
    d = d + 1
if d == 0:
  print('It\'s the same')
else:
  print('Mutated!')

My codon_amino.txt is structured like this:

AAA K
AAC N
AAG K
AAT N
ACA T
ACC T
ACG T
ACT T

How do I compare the DNA strings in patterns of 3? I have it working for strings that are 3 letters long, but it raises an error for anything longer.

EDIT:

If I knew how to split a and b into lists of three-letter chunks, that might help, something like:

a2 = a.split(SPLITINTOINTERVALSOFTHREE)

Then I could easily use a for loop to iterate through them, but how do I split them in the first place?
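One way to get such a list is to slice the string at every third index. This is only a minimal sketch: the helper name `split_codons` is hypothetical and not part of the original code.

```python
def split_codons(dna):
    """Return the consecutive 3-letter chunks of dna as a list."""
    # range(0, len(dna), 3) yields the start index of each chunk: 0, 3, 6, ...
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


print(split_codons('AAAAACAAG'))  # ['AAA', 'AAC', 'AAG']
```

Each chunk can then be looked up in the codon dictionary inside an ordinary for loop.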

EDIT: THE SOLUTION:

sequence = {}
for line in open('codon_amino.txt'):
  pattern, character = line.split()
  sequence[pattern] = character
a = input('Enter original DNA: ')
b = input('Enter patient DNA: ')
# Compare the amino acid for each 3-letter slice of both strings.
# A single all() call covers the whole string, so no outer loop or counter is needed.
if all(sequence[a[i:i+3]] == sequence[b[i:i+3]] for i in range(0, len(a), 3)):
  print('The patient\'s amino acid sequence is not mutated.')
else:
  print('The patient\'s amino acid sequence is mutated.')

Solution

  • I think you can replace your second loop and comparisons with:

    if all(sequence[a[i:i+3]] == sequence[b[i:i+3]] for i in range(0, len(a), 3)):
        print('It\'s the same')
    else:
        print('Mutated!')
    

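    The all function iterates over the generator expression and returns True only if every value is True; it stops and returns False as soon as it meets a False value. The generator expression compares successive length-three slices of the two strings, starting at indices 0, 3, 6, and so on.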
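    To try the check without the codon file, here is a small self-contained sketch. The dictionary holds only the eight codons listed in the question, and the function name `same_amino_acids` is made up for illustration. It also assumes that strings of different lengths should count as different, which the snippet above does not check.

```python
# Only the eight codons shown in the question; a full table has 64 entries.
sequence = {
    'AAA': 'K', 'AAC': 'N', 'AAG': 'K', 'AAT': 'N',
    'ACA': 'T', 'ACC': 'T', 'ACG': 'T', 'ACT': 'T',
}


def same_amino_acids(a, b):
    """Return True if a and b translate to the same amino acid sequence."""
    # Assumption: inputs of different lengths are treated as different.
    if len(a) != len(b):
        return False
    # Compare the translations of each pair of 3-letter slices.
    return all(sequence[a[i:i + 3]] == sequence[b[i:i + 3]]
               for i in range(0, len(a), 3))


print(same_amino_acids('AAAAACAAG', 'AAAAACAAA'))  # True: both are KNK
print(same_amino_acids('AAAAACAAG', 'AAAACAAAG'))  # False: KNK vs KTK
```

    A codon that is missing from the dictionary would still raise a KeyError here, just as in the original code.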