Named regular expression back reference in PHP


Using PHP 5.5+ regular expressions, I'd like to use a named backreference to match the same text that an earlier named group captured.

For example, I'd like the following texts to match:

1-1
2-2

but not the following:

8-7

When I try using backreferences, however, PHP reports a match:

/* This statement evaluates to 1 */
preg_match("/(?<one>[1-9])\-(?<two>\\g<one>)/", "8-7");

Is there a workaround for this other than using numbered references?


Solution

  • See this excerpt from the PCRE documentation:

    For compatibility with Oniguruma, the non-Perl syntax \g followed by a name or a number enclosed either in angle brackets or single quotes, is an alternative syntax for referencing a subpattern as a subroutine, possibly recursively.

    Note that \g{...} (Perl syntax) and \g<...> (Oniguruma syntax) are not synonymous. The former is a back reference; the latter is a subroutine call.

    With \g<one> you are not referring to the text that the group matched but to the subpattern itself, so the pattern [1-9] is simply applied again and any digit is accepted. See the explanation on regex101.com:

    \g<one> recurses the subpattern named one

    To match the same text that the first group captured, use \1:

    (?<one>[1-9])\-(?<two>\1)
    

    Or, as a named back-reference to the captured text:

    (?<one>[1-9])\-(?<two>\g{one})
    

    \1 matches the same text as most recently matched by the 1st capturing group

    See a numbered demo and a named back-reference demo.
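
To make the difference concrete, here is a minimal sketch that runs the question's three inputs against both forms. The variable names are illustrative, the patterns are written in single-quoted strings so the backslashes need no doubling, and the \k<one> variant is an addition that does not appear in the answer above (PCRE accepts it as another spelling of a named back-reference).

```php
<?php
// Subroutine call: \g<one> re-applies the subpattern [1-9],
// so the second digit can be ANY digit from 1 to 9.
$subroutine = '/(?<one>[1-9])-(?<two>\g<one>)/';

// Named back-reference: \g{one} must match the exact text
// that group "one" captured.
$backref = '/(?<one>[1-9])-(?<two>\g{one})/';

// Alternative named back-reference syntax (not from the original answer).
$backrefK = '/(?<one>[1-9])-(?<two>\k<one>)/';

foreach (['1-1', '2-2', '8-7'] as $input) {
    printf(
        "%s  subroutine=%d  backref=%d  k=%d\n",
        $input,
        preg_match($subroutine, $input),
        preg_match($backref, $input),
        preg_match($backrefK, $input)
    );
}
```

Only the subroutine form accepts 8-7. Both back-reference forms reject it and still accept 1-1 and 2-2.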