Convert a letter in string to different letters with multiple output

So I have a DNA sequence

TANNNT

where N = ["A", "G", "C", "T"]

I want to generate all possible outputs: TAAAAT, TAAAGT, TAAACT, TAAATT, and so on.

So far I have found a solution online that uses permutations: I can do perms = [''.join(p) for p in permutations(N, 3)] and then build each sequence as "TA" + perm + "T" for every perm in perms.

But I wonder if there is an easier way to do this, because I have many more DNA sequences and it would take a lot of time to hard-code each one.


By hard-coding I mean that I would have to write

N1 = [''.join(p) for p in permutations(N, 1)]
N2 = [''.join(p) for p in permutations(N, 2)]
N3 = [''.join(p) for p in permutations(N, 3)]

and then loop over the result:

for perm in N3:
    key = "TA" + perm + "T"

Since my sequences are quite long, I don't want to count how many consecutive Ns each one contains, so I would like to know if there is a better way to do this.


  • You can use your permutation results to format a string like:


    import itertools as it
    import re
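    def convert_sequence(base_string, target_letter, perms):
        # Match one run of consecutive target letters, e.g. 'NNN'
        REGEX = re.compile('(%s+)' % target_letter)
        match = REGEX.search(base_string).group(0)
        # Swap the run for a '%s' placeholder: 'TANNNT' -> 'TA%sT'
        pattern = REGEX.sub('%s', base_string)
        # permutations() never repeats a letter, so 'TAAAAT' is not produced
        return [pattern % ''.join(p) for p in it.permutations(perms, len(match))]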

    Test Code:

    print(convert_sequence('TANNNT', 'N', ['A', 'G', 'C', 'T']))
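
Note that the outputs listed in the question (TAAAAT, TAAAGT, ...) repeat letters, which permutations cannot produce. One possible alternative, sketched here rather than taken from the answer above, is itertools.product, which picks one letter per position and so needs no counting of consecutive Ns. The function name expand_sequence and its parameters wildcard and alphabet are hypothetical, chosen for illustration only.

```python
from itertools import product

def expand_sequence(sequence, wildcard="N", alphabet="AGCT"):
    """Return every sequence obtained by replacing each wildcard
    with one letter from alphabet (names here are illustrative)."""
    # Fixed letters contribute a single choice; wildcards contribute the whole alphabet.
    choices = [alphabet if ch == wildcard else ch for ch in sequence]
    # product() takes one letter per position, so repeats such as 'AAA' are included.
    return ["".join(combo) for combo in product(*choices)]

results = expand_sequence("TANNNT")
print(len(results))   # 64, i.e. 4 letters ** 3 wildcards
print(results[:4])    # ['TAAAAT', 'TAAAGT', 'TAAACT', 'TAAATT']
```

Because every position is handled separately, the same call should also cover sequences with several separate runs of N, such as "TNANNT". The result grows as 4 ** (number of Ns), so for long runs it may be better to loop over product(*choices) lazily instead of building the full list.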
