
How to find all occurrences of a substring?


Python has string.find() and string.rfind() to get the index of a substring in a string.

I'm wondering whether there is something like string.find_all() which can return all found indexes (not only the first from the beginning or the first from the end).

For example:

string = "test test test test"

print(string.find('test'))   # 0
print(string.rfind('test'))  # 15

# this is the goal
print(string.find_all('test'))  # [0, 5, 10, 15]

For counting the occurrences, see Count number of occurrences of a substring in a string.


Solution

  • There is no built-in string method that does what you're looking for, but you can use regular expressions, which are more powerful (wrap the substring in re.escape() if it may contain regex metacharacters):

    import re
    [m.start() for m in re.finditer('test', 'test test test test')]
    #[0, 5, 10, 15]
    

    If you want to find overlapping matches, lookahead will do that:

    [m.start() for m in re.finditer('(?=tt)', 'ttt')]
    #[0, 1]
    

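    If you want a reverse find-all without overlaps, you can combine positive and negative lookahead into an expression like the one below. Note that it keeps a match only if no other match starts within the following len(search)-1 characters, so it gives the expected result for 'ttt' but on a longer run such as 'tttt' it returns [2] rather than [0, 2]: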

    search = 'tt'
    [m.start() for m in re.finditer('(?=%s)(?!.{1,%d}%s)' % (search, len(search)-1, search), 'ttt')]
    #[1]
    

    re.finditer returns an iterator, so you can change the [] in the examples above to () to get a generator expression instead of a list. That is more memory-efficient if you only iterate through the results once.
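
If you would rather avoid regular expressions, the same results can be obtained by calling str.find() or str.rfind() in a loop. The following is a minimal sketch, not a standard-library API: the names find_all and rfind_all and the overlapping parameter are made up for illustration.

```python
def find_all(text, sub, overlapping=False):
    """Yield the start index of each occurrence of sub in text, left to right.

    Hypothetical helper: str has no find_all() method.
    """
    if not sub:
        raise ValueError("sub must be non-empty")
    # Advance by one character to allow overlaps, otherwise skip past the match.
    step = 1 if overlapping else len(sub)
    start = text.find(sub)
    while start != -1:
        yield start
        start = text.find(sub, start + step)


def rfind_all(text, sub):
    """Yield start indexes of non-overlapping occurrences, right to left.

    Hypothetical helper built on str.rfind().
    """
    if not sub:
        raise ValueError("sub must be non-empty")
    end = len(text)
    while True:
        # Only search in text[0:end], so matches never overlap the previous one.
        start = text.rfind(sub, 0, end)
        if start == -1:
            return
        yield start
        end = start


print(list(find_all("test test test test", "test")))  # [0, 5, 10, 15]
print(list(find_all("ttt", "tt", overlapping=True)))  # [0, 1]
print(list(rfind_all("tttt", "tt")))                  # [2, 0]
```

Both helpers are generators, so they can be consumed lazily or wrapped in list() as above. Because they use plain string search, the substring needs no escaping.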