
Need some help on a function


Write a function named one_frame that takes one argument seq and performs the tasks specified below. The argument seq is to be a string that contains information for the bases of a DNA sequence.

  • a → The function searches the given DNA string from left to right, three nucleotides at a time (in a single reading frame).
  • b → When it hits a start codon ATG it calls get_orf on the slice of the string beginning at that start codon.
  • c → The ORF returned by get_orf is added to a list of ORFs.
  • d → The function skips ahead in the DNA string to the point right after the ORF that we just found and starts looking for the next ORF.
  • e → Steps a through d are repeated until we have traversed the entire DNA string.

The function should return a list of all ORFs it has found.

def one_frame(seq):
    start_codon = 'ATG'
    list_of_codons = []
    y = 0
    while y < len(seq):
        subORF = seq[y:y + 3]
        if start_codon in subORF:
            list_of_codons.append(get_orf(seq))
            return list_of_codons
        else:
            y += 3

one_frame('ATGAGATGAACCATGGGGTAA')
  1. The one_frame call at the very bottom is a test case. It is supposed to return ['ATGAGA', 'ATGGGG'], but my code returns only the first item in the list.
  2. How can I fix my function so that it also returns the rest of that list?

Solution

  • You have several problems:

    1. You have return list_of_codons inside the loop, so the function returns as soon as it finds the first start codon and never looks for another ORF. Move the return to the end of the function, after the loop.

    2. You have y += 3 only in the else: block, so y is never incremented when a start codon is found. Once the early return is gone, the loop would test the same codon forever.

    3. You need to call get_orf() on the slice of the string starting at y, i.e. seq[y:], not on the whole string (task b).

    4. Task d says you have to skip to the point right after the ORF that get_orf() returned in task b, not just continue at the next codon. Advancing y by the length of that ORF does this.

    def one_frame(seq):
        start_codon = 'ATG'
        list_of_orfs = []
        y = 0
        while y < len(seq):
            subORF = seq[y:y + 3]
            if start_codon == subORF:
                orf = get_orf(seq[y:])
                list_of_orfs.append(orf)
                y += len(orf)
            else:
                y += 3
    
        return list_of_orfs
    
    one_frame('ATGAGATGAACCATGGGGTAA')
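
The question does not show get_orf, so the code above cannot be run on its own. Below is a minimal, self-contained sketch for trying it out. The get_orf here is a hypothetical stand-in, not the one from the assignment: it assumes an ORF runs from the start codon up to, but not including, the first in-frame stop codon (TAA, TAG or TGA), or to the end of the string if there is no stop codon. The max(len(orf), 3) guard is also an addition, so that the loop still advances if some other get_orf returns an empty string.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # the three standard stop codons


def get_orf(seq):
    """Hypothetical stand-in for the assignment's get_orf.

    Assumes seq begins with a start codon and returns everything up to,
    but not including, the first in-frame stop codon. If no stop codon
    is found, the whole of seq is returned.
    """
    for i in range(0, len(seq), 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[:i]
    return seq


def one_frame(seq):
    start_codon = "ATG"
    list_of_orfs = []
    y = 0
    while y < len(seq):
        if seq[y:y + 3] == start_codon:
            orf = get_orf(seq[y:])  # task b: slice starting at the start codon
            list_of_orfs.append(orf)  # task c
            # task d: jump past the ORF; always move at least one codon
            # so the loop cannot stall on an empty result (added guard).
            y += max(len(orf), 3)
        else:
            y += 3  # task a: step one codon at a time in this frame
    return list_of_orfs


print(one_frame("ATGAGATGAACCATGGGGTAA"))  # ['ATGAGA', 'ATGGGG']
```

With this stand-in, the test case from the question gives the expected list. If your own get_orf defines an ORF differently (for example, by including the stop codon), the amount to skip may need adjusting.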