python regex string digits

Regex find ALL patterns between string


I want to match the digits between "000" and "000", between \b and "000", or between "000" and \b, in a string like this:

11101110001011101000000011101010111

I have tried with expressions like this:

(?<=000)\d+(?=000)

but I only get the single longest match.

I expect to get:

1110111
1011101
0
11101010111

Solution

  • You can use the third-party regex package and its findall() function:

    In [1]: s = "11101110001011101000000011101010111"
    
    In [2]: import regex
    
    In [3]: regex.findall(r"(?<=000|^)\d+?(?=000|$)", s)
    Out[3]: ['1110111', '1011101', '0', '00011101010111']
    

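    The 000|^ and 000|$ alternatives match either 000 or the start and end of the string, respectively. Also note the ? after \d+, which makes the quantifier non-greedy. The last item still differs from the expected 11101010111: lookarounds do not consume the delimiter, so inside the run of seven zeros the lookbehind is already satisfied right after the single 0, and the final match begins there.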

    Note that the regular re.findall() would fail with the following error in this case:

    error: look-behind requires fixed-width pattern

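    This is because re requires a lookbehind to have a fixed width, and 000|^ mixes a three-character alternative with a zero-width one. The regex package supports variable-width lookbehinds.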
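
If installing regex is not an option, the same idea can be sketched with the standard re module. This is only one possible approach, not part of the original answer, and the names lookaround and consuming are illustrative. The first pattern moves the alternation outside the lookbehind so that it stays fixed-width. The second consumes the leading 000 instead of looking behind it and captures only the digits that follow.

```python
import re

s = "11101110001011101000000011101010111"

# Variant 1: the lookbehind contains only "000" (fixed width); the
# alternative "start of string" sits outside it, so re accepts the pattern.
lookaround = re.compile(r"(?:(?<=000)|^)\d+?(?=000|$)")
print(lookaround.findall(s))
# ['1110111', '1011101', '0', '00011101010111']

# Variant 2: consume the leading delimiter and capture the digits after it.
# Because the 000 is consumed, it cannot be reused as part of the next match.
consuming = re.compile(r"(?:^|000)(\d+?)(?=000|$)")
print(consuming.findall(s))
# ['1110111', '1011101', '0', '11101010111']
```

The first variant reproduces the regex result shown above, while the second returns the list given as expected in the question for this input. Other inputs with long runs of zeros may split differently, so check the result against your own data.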