python regex lookbehind

Regular expression with positive lookbehind at beginning fails to match whole string


I am using Python's re module to match strings like decimal(4,1) and decimal(10,5) while returning only the 4,1 and 10,5, using the following regular expression:

(?<=decimal\()\d+,\d+(?=\)$)

Let's say I compile the regex with re.compile and name it DECIMAL. If I try to search decimal(4,1) for instances of the regex like so:

import re
DECIMAL = re.compile(r'(?<=decimal\()\d+,\d+(?=\)$)')
results = DECIMAL.search('decimal(4,1)')

results.group(0) returns the string 4,1 as desired. However, if I try to match rather than search:

results = DECIMAL.match('decimal(4,1)')

results evaluates to None.

Does match fail here because it tries to match the consuming part of the regex starting at the very beginning of the haystack, leaving no preceding text for the positive lookbehind to confirm?

As for the immediately practical, simply searching won't work in this case, since DECIMAL would turn up results in unacceptable strings like snarfdecimal(4,1). Should I be dropping in a beginning-of-string token somewhere, or is there something else I'm missing entirely?


Solution

  • You really don't need a positive lookbehind at all; a capturing group used with match does the job:

    >>> import re
    >>> find_decimal = re.compile(r'decimal\((\d+,\d+)\)')
    >>> find_decimal.match('decimal(4,1)').group(1)
    '4,1'
    

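    As for why the original pattern fails with match, your thinking is correct: match only attempts the pattern at position 0 of the string, and at that position there is no preceding decimal( for the lookbehind to find. search succeeds because it also tries later positions, including the one right after the opening parenthesis.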
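
The behaviour described above can be checked with a short sketch. It also shows one possible way to reject strings such as snarfdecimal(4,1) or decimal(4,1)x: match anchors only the start of the string, so the sketch uses fullmatch, which anchors both ends. The helper name parse_decimal_args and the names LOOKBEHIND and CAPTURE are made up for illustration and are not part of the original question or answer.

```python
import re

# The lookbehind pattern from the question.
LOOKBEHIND = re.compile(r'(?<=decimal\()\d+,\d+(?=\)$)')

# match() only tries position 0, where nothing precedes the cursor,
# so the lookbehind for 'decimal(' cannot succeed there.
assert LOOKBEHIND.match('decimal(4,1)') is None

# search() tries every position; at index 8 (just after 'decimal(')
# the lookbehind succeeds and the digits are matched.
found = LOOKBEHIND.search('decimal(4,1)')
assert found.group(0) == '4,1'
assert found.start() == 8

# Alternative: a capturing group, as in the answer's pattern.
CAPTURE = re.compile(r'decimal\((\d+,\d+)\)')


def parse_decimal_args(text):
    """Hypothetical helper: return the 'p,s' part of 'decimal(p,s)'.

    Returns None unless the whole string is exactly of that form.
    """
    m = CAPTURE.fullmatch(text)  # fullmatch anchors both start and end
    return m.group(1) if m else None
```

With this helper, parse_decimal_args('decimal(10,5)') gives '10,5', while both 'snarfdecimal(4,1)' and 'decimal(4,1)x' give None.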