How to get all the indexes of leading zeroes using regex in Python


Using only Python's re library, I want to write a function that returns the positions of all leading 0s in a string.

For example, if the string were: My house has 01 garden and 003 rooms. I would want the function to return 13, 27 and 28.

I tried for example:

import re
string = "My house has 01 garden and 003 rooms."
pattern = r"(0+)[1-9]\d*"  # raw string so that \d is not treated as a string escape

print(re.findall(pattern,string))

Obviously, the output gives me the matches but no position...


Solution

  • You can do the following:

    import re
    
    text = "My house has 01 garden and 003 rooms."
    pattern = re.compile(r"\b0+")
    
    
    def leading_zeros_index(s: str) -> list:
        return [i for m in pattern.finditer(s) for i in range(m.start(), m.end())]
    
    
    print(leading_zeros_index(text))
    

    Output:

    [13, 27, 28]
    

    Basically, you use .finditer() to get the match objects, then build a range() from each match object's .start() and .end(). The list comprehension flattens those ranges into a single list of indexes.

    I used \b0+ as the pattern. There is no need to check which characters come after the zeros. \b is a word boundary; here it means the zeros must be at the start of a word.
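
One thing to keep in mind: \b0+ also matches a standalone 0 (as in "I have 0 cats"), which may or may not count as a "leading" zero for your purposes. If it should not, a possible tweak is to add a lookahead so that the zeros must be followed by another digit. The sketch below is only an illustration under that assumption; the names strict_pattern and strict_leading_zeros_index are hypothetical and not part of the answer above.

```python
import re

# The pattern from the answer above: any run of zeros at the start of a word.
pattern = re.compile(r"\b0+")

# Hypothetical stricter variant: the zeros must be followed by another digit,
# so a lone "0" is not reported. The lookahead (?=\d) consumes no characters.
strict_pattern = re.compile(r"\b0+(?=\d)")


def leading_zeros_index(s: str) -> list:
    # Flatten the span of every match into individual indexes.
    return [i for m in pattern.finditer(s) for i in range(m.start(), m.end())]


def strict_leading_zeros_index(s: str) -> list:
    # Same flattening, using the stricter pattern.
    return [i for m in strict_pattern.finditer(s) for i in range(*m.span())]


text = "My house has 01 garden and 003 rooms."
print(strict_leading_zeros_index(text))  # [13, 27, 28]

edge_case = "I have 0 cats and 00 dogs."
print(leading_zeros_index(edge_case))         # [7, 18, 19]
print(strict_leading_zeros_index(edge_case))  # [18]
```

With the lookahead, "00" still reports its first zero, because that zero is followed by another digit. If only zeros in front of a non-zero digit should count, (?=[1-9]) could be used instead of (?=\d).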