python, regex, findall

Why does passing the re.I flag to findall() give different results on a compiled regex object than with the module-level re.findall() function?


When I compile the regex first and call the findall() method on the compiled object with the re.I flag, the result differs from calling the re.findall() function directly with the same flag.

(Removing the re.I flag from the first example "fixes" the discrepancy.)

import re

emails1 = re.compile(r"([A-z0-9._+-]+@[A-z0-9._+-]+\.[A-z]{2,})")
result = emails1.findall("[email protected]", re.I)
print(result)
>>>['[email protected]']

emails2 = re.findall(r"([A-z0-9._+-]+@[A-z0-9._+-]+\.[A-z]{2,})", "[email protected]", re.I)
print(emails2)
>>>['[email protected]']

Appreciate your help!


Solution

  • The findall() method of a compiled pattern has a different parameter signature from the module-level re.findall() function.

    Function

    findall(pattern, string, flags=0)
    

    Method

    findall(string, pos=0, endpos=9223372036854775807)
    

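    re.I is an enum member (re.RegexFlag) whose integer value is 2. The second positional argument of the method is pos, not flags, so you are really asking the compiled pattern to start searching at index 2, which skips that first "xx". The flag itself is never applied. To use a flag with a compiled pattern, pass it to re.compile() instead.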
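
The difference can be reproduced with a small sketch. The input string below is a hypothetical stand-in (the address in the question is not recoverable here), and the variable names are only illustrative; the pattern is the one from the question.

```python
import re

# Hypothetical input, chosen so that the first two characters are "xx".
text = "xxuser@example.com"
pattern = r"([A-z0-9._+-]+@[A-z0-9._+-]+\.[A-z]{2,})"

compiled = re.compile(pattern)

# re.I lands in the `pos` slot, so this behaves like findall(text, 2).
as_pos = compiled.findall(text, re.I)
explicit_pos = compiled.findall(text, 2)

# The module-level function has a real `flags` parameter.
as_flag = re.findall(pattern, text, re.I)

# To use the flag with a compiled pattern, give it to re.compile().
compiled_with_flag = re.compile(pattern, re.I).findall(text)

print(int(re.I))           # 2
print(as_pos)              # ['user@example.com']
print(explicit_pos)        # ['user@example.com']
print(as_flag)             # ['xxuser@example.com']
print(compiled_with_flag)  # ['xxuser@example.com']
```

Passing the flag to re.compile() keeps the benefit of compiling once while leaving the method's pos and endpos arguments free for their intended purpose.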