php regex pcre

Why does a call to the named capture group not match the same value?


^(?'a'1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.(?&a)$

I'm learning regex and came across this problem: the pattern does not match 255.255, but it does match 255.25.

What's wrong with my regex?

It works if I repeat the same pattern:

^(?:1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.(?:1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])$

but it does not work when I try to reuse the named capture group with (?&a).
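
A minimal way to reproduce this in PHP might look like the sketch below. It runs both patterns from the question against the two inputs and prints the PCRE version PHP was built with. The variable names are only illustrative, and the named group is written as (?<a>...), which is equivalent to (?'a'...) and avoids quote escaping in the PHP string.

```php
<?php
// Both patterns from the question; (?<a>...) is the same named group as (?'a'...).
$withCall   = '/^(?<a>1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.(?&a)$/';
$duplicated = '/^(?:1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.(?:1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])$/';

// PCRE_VERSION is a string such as "10.42 2022-12-11".
echo 'PCRE version: ', PCRE_VERSION, "\n";

$results = [];
foreach (['255.255', '255.25'] as $subject) {
    $results[$subject] = [
        'call'       => preg_match($withCall, $subject) === 1,
        'duplicated' => preg_match($duplicated, $subject) === 1,
    ];
    printf(
        "%-8s with (?&a): %s, duplicated pattern: %s\n",
        $subject,
        var_export($results[$subject]['call'], true),
        var_export($results[$subject]['duplicated'], true)
    );
}
```

The result for 255.255 with (?&a) is the one that can differ between environments, so no expected output is shown here.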


Solution

  • It depends on the PCRE version. From PCRE news 10.30:

    The new implementation allows backtracking into recursive group calls in patterns, making it more compatible with Perl, and also fixes some other previously hard-to-do issues.

    • Before PCRE v10.30: Originally, a recursive group call was atomic by default.

      The order of your alternation is the pitfall, because the first successful alternative wins. In your case, 1?[0-9]?[0-9] matches 25 (the other alternatives are never tested); then, when the regex engine tries $ and fails, it cannot backtrack into the group. You can solve the problem by writing your named capture like this:

      (?<a>1[0-9]{0,2}|[3-9][0-9]?|2(?:[0-4][0-9]?|5[0-5]?|[6-9])?|0)
      

      It's a bit longer, but each number follows a unique path to succeed: demo

    • Since PCRE 10.30: In newer PCRE versions, recursive group calls are no longer atomic (backtracking is possible, as in Perl) and your pattern works as is: https://3v4l.org/HUICY

    Note that regex101 and PHP < 7.3 use older PCRE versions, in which recursive group calls are always atomic.
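
To see the pitfall regardless of the installed PCRE version, one option is to put the second number in an explicit atomic group (?>...), which should forbid backtracking in the same way an old recursive call did. The sketch below is only an illustration under that assumption: the helper name matchesAtomically is made up, and the atomic group stands in for (?&a) rather than reproducing it exactly.

```php
<?php
// The original alternation from the question and the rewritten one from the answer.
$original  = '1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5]';
$rewritten = '1[0-9]{0,2}|[3-9][0-9]?|2(?:[0-4][0-9]?|5[0-5]?|[6-9])?|0';

// Hypothetical helper: the first number uses a normal group, the second one an
// atomic group (?>...), so the engine cannot backtrack into it once it has matched.
function matchesAtomically(string $alternation, string $subject): bool
{
    $pattern = '/^(?:' . $alternation . ')\.(?>' . $alternation . ')$/';
    return preg_match($pattern, $subject) === 1;
}

var_dump(matchesAtomically($original, '255.255'));  // bool(false): "25" is kept, then $ fails
var_dump(matchesAtomically($original, '255.25'));   // bool(true)
var_dump(matchesAtomically($rewritten, '255.255')); // bool(true)
var_dump(matchesAtomically($rewritten, '255.256')); // bool(false): 256 is out of range
```

One difference worth knowing: unlike the original alternation, the rewritten one does not accept leading zeros such as 01.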