
Generating random sequences of DNA


I am trying to generate random DNA sequences in Python using the random module, but my output is only a single character. For example, if I ask for DNA of length 5 (String(5)), I should get something like "CTGAT". Similarly, String(4) should give something like "CTGT". Instead I get "G" or "C" or "T" or "A", i.e. only a single character each time. Could anyone please help me with this?

I tried the following code:

from random import choice
def String(length):

   DNA=""
   for count in range(length):
      DNA+=choice("CGTA")
      return DNA

Solution

  • You return too quickly:

    from random import choice
    def String(length):
    
       DNA=""
       for count in range(length):
          DNA+=choice("CGTA")
          return DNA
    

    If your return statement is inside the for loop, you will only iterate once: the return exits the function during the first pass, after a single character has been appended.

    From the Python Documentation on return statements: "return leaves the current function call with the expression list (or None) as return value."

    So, put the return at the end of your function:

    from random import choice

    def String(length):
        DNA = ""
        for count in range(length):
            DNA += choice("CGTA")
        return DNA  # runs only after the loop has finished
    

    EDIT: Here's a weighted choice method (it currently only works for strings with non-negative integer weights, since it uses string repetition).

    def weightedchoice(items): # this doesn't require the numbers to add up to 100
        return choice("".join(x * y for x, y in items))
    

    Then, you want to call weightedchoice instead of choice in your loop:

    DNA += weightedchoice([("C", 10), ("G", 20), ("A", 40), ("T", 30)])
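
As an alternative sketch (not part of the original answer): on Python 3.6 and later, random.choices can draw all the bases in a single call and accepts relative weights directly, so neither the explicit loop nor the string-repetition trick is needed. The name random_dna and its weights and rng parameters are hypothetical, chosen only for illustration.

```python
import random

def random_dna(length, weights=None, rng=random):
    """Return a random DNA string of the given length.

    weights (hypothetical parameter) maps each base to a relative weight,
    e.g. {"C": 10, "G": 20, "A": 40, "T": 30}; the values need not sum to 100.
    rng (hypothetical parameter) is the random module or a random.Random instance.
    """
    if weights is None:
        # No weights given: draw uniformly from the four bases.
        bases, base_weights = "CGTA", None
    else:
        bases, base_weights = list(weights), list(weights.values())
    # random.choices draws k items with replacement in one call.
    return "".join(rng.choices(bases, weights=base_weights, k=length))
```

Calling random_dna(5) returns a five-character string such as "CTGAT", and random_dna(5, {"C": 10, "G": 20, "A": 40, "T": 30}) favours "A" and "T". Passing rng=random.Random(seed) should make the output reproducible for a given seed.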