python regex string overlapping

Regex: counting the number of times a substring occurs in a string, including overlapping occurrences


I'm working on a Rosalind problem that asks for the positions at which a substring occurs in a longer string. The only problem is that one of the occurrences overlaps another: the output should be 1, 3, 9 (assuming 0-based counting), but I'm only getting 1 and 9. Here's my code:

import re

s = 'GATATATGCATATACTT'
t = 'ATAT'

substrings = re.compile(t)
matches = substrings.finditer(s)

for match in matches:
    print(match.start() + 1)  # 1-based position; doesn't find overlapping ones

Any help would be appreciated, thanks!


Solution

  • If you can install a third-party module, the regex module offers an extended version of the re module API in which findall and finditer accept an overlapped=True argument.

    https://pypi.python.org/pypi/regex

    Otherwise, you might be able to adapt this answer.
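If installing a third-party module isn't an option, one common workaround with the standard re module is a zero-width lookahead. This is a minimal sketch of that technique, not necessarily what the answer referred to above does; the helper name overlapping_starts is made up for illustration.

```python
import re

def overlapping_starts(pattern_text, text):
    """Return 0-based start positions of every occurrence, overlapping or not."""
    # (?=...) is a zero-width lookahead: it matches without consuming any
    # characters, so the scanner moves forward one position at a time and
    # overlapping occurrences are all reported.
    # re.escape treats pattern_text as a literal string, not a regex.
    lookahead = re.compile("(?=" + re.escape(pattern_text) + ")")
    return [m.start() for m in lookahead.finditer(text)]

s = 'GATATATGCATATACTT'
t = 'ATAT'

print(overlapping_starts(t, s))                        # 0-based: [1, 3, 9]
print([pos + 1 for pos in overlapping_starts(t, s)])   # 1-based: [2, 4, 10]

# With the third-party regex module installed, the overlapped flag
# mentioned above should give the same positions:
#   import regex
#   [m.start() for m in regex.finditer(t, s, overlapped=True)]
```

Because each match is empty, match.start() is the only useful piece of information here; if you need the matched text itself, slice it out of the string using the start position and len(t).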