Using PCRE v8.42, I am trying to abstract a regex into a named subroutine, but when it's in a subroutine, it seems to behave differently.
This outputs 10/:
echo '10/' | pcregrep '(?:0?[1-9]|1[0-2])\/'
This outputs nothing:
echo '10/' | pcregrep '(?(DEFINE)(?<MONTHNUM>(?:0?[1-9]|1[0-2])))(?&MONTHNUM)\/'
Are these two regular expressions not equivalent?
In PCRE 8.x, and in PCRE2 versions prior to 10.30, subroutine calls are always treated as atomic groups. Your (?(DEFINE)(?<MONTHNUM>(?:0?[1-9]|1[0-2])))(?&MONTHNUM)\/ regex is therefore equivalent to (?>0?[1-9]|1[0-2])\/. See this regex demo, where 10/ is not matched.
There is no match because 0?[1-9] matched the 1 in 10/, and since backtracking into an atomic group is not allowed, the second alternative was never tried. The whole match then failed because there is no / after the 1.
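The effect can be reproduced outside pcregrep. The following is only a minimal sketch in Python, whose re module has no subroutine calls, so the atomic behaviour is emulated with a lookahead plus a backreference (an assumption made for illustration, not what PCRE does internally). The names backtracking, atomic_like, and matched_text are hypothetical.

```python
import re

# Plain non-capturing group: the engine may backtrack into the alternation.
backtracking = re.compile(r"(?:0?[1-9]|1[0-2])/")

# Emulated atomic group: the lookahead captures the first alternative that
# succeeds, and the backreference \1 then consumes exactly that text.
# Once the lookahead has succeeded, other alternatives are not retried.
atomic_like = re.compile(r"(?=(0?[1-9]|1[0-2]))\1/")


def matched_text(pattern, text):
    """Return the matched substring, or None when there is no match."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(matched_text(backtracking, "10/"))  # 10/
print(matched_text(atomic_like, "10/"))   # None
```

The second pattern commits to the 1 matched by 0?[1-9] and never tries 1[0-2], which mirrors what happens with the subroutine call.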
You need to make sure the longer alternative comes first:
(?(DEFINE)(?<MONTHNUM>(?:1[0-2]|0?[1-9])))(?&MONTHNUM)/
See the regex demo. Note that in the pcregrep pattern, you do not need to escape /.
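As a quick sanity check that putting the longer alternative first still covers every month, here is a small Python sketch. It again emulates the atomic group with a lookahead and a backreference, because Python's re module does not support (?&NAME) subroutine calls. The names original, reordered, and months are hypothetical.

```python
import re

# Lookahead-plus-backreference emulation of an atomic group, in both orders.
original = re.compile(r"(?=(0?[1-9]|1[0-2]))\1/")
reordered = re.compile(r"(?=(1[0-2]|0?[1-9]))\1/")

# Every month written without and with a leading zero: 1..12 and 01..09.
months = [str(n) for n in range(1, 13)] + [f"{n:02d}" for n in range(1, 10)]

# Collect the month strings each pattern fails to match when followed by "/".
failing_original = [m for m in months if not original.fullmatch(m + "/")]
failing_reordered = [m for m in months if not reordered.fullmatch(m + "/")]

print(failing_original)   # ['10', '11', '12']
print(failing_reordered)  # []
```

Only the two-digit months beginning with 1 are affected by the original ordering, which is why swapping the alternatives is enough.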
Alternatively, you can use PCRE2 v10.30 or newer (for example, via pcre2grep), where backtracking into subroutine calls is supported.