Tags: python, combinations, permutation, bioinformatics, python-itertools

Generate all combinations of nucleotide k-mers between range(i, j)


I need to generate a list of all possible nucleotide sequences (k-mers) with lengths from 5 to 15.

nucleotides = ['A', 'T', 'G', 'C']

Expected results:

AAAAA
AAAAT
AAAAC
AAAAG
AAATA
AAATT
...
AAAAAAAAAAAAAAA
AAAAAAAAAAAAAAT
etc.

I tried:

import itertools

for i in range(5, 16):
    for j in itertools.permutations(nucleotides, i):
        print(j)

But this doesn't work if len(nucleotides) < i.

Thanks in advance!


Solution

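  • To generate every sequence, use itertools.product() rather than itertools.permutations(). permutations() never reuses an element, so it cannot produce sequences with repeated nucleotides such as AAAAA or AATGC, and it yields nothing at all once the requested length exceeds len(nucleotides). Try this: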

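    import itertools

    for i in range(5, 16):
        # Cartesian product of i copies of nucleotides, so letters may repeat
        combinations = itertools.product(*itertools.repeat(nucleotides, i))
        for j in combinations:
            print(''.join(j))  # join the tuple into a string, e.g. AAAAA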
    

    Update: As @JaredGoguen mentioned, the repeat argument can also be used here:

    combinations = itertools.product(nucleotides, repeat=i)
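
Putting the pieces together, here is a minimal sketch of the approach as a lazy generator. The names kmers and count_kmers are made up for this illustration and are not part of itertools. The sketch assumes you can process sequences one at a time instead of holding them all in a list, which matters because lengths 5 through 15 over four letters give 4^5 + ... + 4^15 = 1,431,655,424 sequences.

```python
import itertools


def kmers(alphabet, min_len, max_len):
    """Lazily yield every string over `alphabet` with length min_len..max_len (inclusive).

    Hypothetical helper for illustration only.
    """
    for k in range(min_len, max_len + 1):
        # product(..., repeat=k) allows the same letter at several positions
        for letters in itertools.product(alphabet, repeat=k):
            yield "".join(letters)


def count_kmers(alphabet_size, min_len, max_len):
    """Number of strings kmers() would yield, computed without generating them."""
    return sum(alphabet_size ** k for k in range(min_len, max_len + 1))


nucleotides = ["A", "T", "G", "C"]

# Small demonstration with lengths 1-2 so the output stays readable.
small = list(kmers(nucleotides, 1, 2))
print(small[:6])   # ['A', 'T', 'G', 'C', 'AA', 'AT']
print(len(small))  # 20

# permutations() never reuses an element, so it yields nothing once r > len(alphabet).
print(len(list(itertools.permutations(nucleotides, 5))))  # 0

# Size of the full 5..15 range from the question.
print(count_kmers(4, 5, 15))  # 1431655424
```

For the range in the question you would loop over kmers(nucleotides, 5, 15) directly. Calling list() on it would try to materialize more than a billion strings, which is unlikely to fit in memory on a typical machine.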