php regex preg-match

Matching 2 regex patterns with 1 pattern in the front or back


I have a regex that matches two patterns, with one pattern at the front or the back, but the first array returns two empty indexes. Why does it do that, and how can I stop it?

$text = "i did";
preg_match("~(?:(did) (.+)|(.+) (did))~", $text, $match);
print_r($match);

echo "<br>";

$text = "did i";
preg_match("~(?:(did) (.+)|(.+) (did))~", $text, $match);
print_r($match);

Result:

Array ( [0] => i did [1] => [2] => [3] => i [4] => did ) 
Array ( [0] => did i [1] => did [2] => i )

Desired result:

Array ( [0] => i did [1] => i [2] => did ) 
Array ( [0] => did i [1] => did [2] => i )

Solution

  • You can use a branch reset (?|...):

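    With a plain non-capturing group (?:...), the four capturing groups are numbered 1 to 4 from left to right across both alternatives. When the second alternative matches, groups 1 and 2 took no part in the match, so preg_match reports them as empty strings. Alternatives inside a branch reset group share the same capturing group numbers, so each alternative starts again at group 1. The syntax is (?|regex), where (?| opens the group and regex is any regular expression.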

    Your preg_match will look like:

    preg_match("~(?|(did) (.+)|(.+) (did))~", $text, $match);
    


    Result for $text = "i did":

    Array
    (
        [0] => i did
        [1] => i
        [2] => did
    )
    

    Your regex is presumably a simplified sample. If you need to match a single word before or after did, use \w+ instead of .+, since .+ also matches spaces and punctuation:

    preg_match("~(?|(did) (\w+)|(\w+) (did))~", $text, $match);
    

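To see the difference side by side, here is a minimal sketch that runs the original pattern and the branch reset pattern against both inputs from the question. The capture() helper and the variable names $plain and $branch are made up for this illustration and are not part of the original post.

```php
<?php
// Hypothetical helper (not from the original post): applies one pattern
// to one input and returns whatever preg_match captured.
function capture(string $pattern, string $text): array
{
    preg_match($pattern, $text, $match);
    return $match;
}

// Plain non-capturing group: the groups are numbered 1-4 across both alternatives.
$plain  = '~(?:(did) (.+)|(.+) (did))~';
// Branch reset group: each alternative starts numbering again at 1.
$branch = '~(?|(did) (.+)|(.+) (did))~';

foreach (['i did', 'did i'] as $text) {
    echo $text, "\n";
    echo '  plain:  ', json_encode(capture($plain, $text)), "\n";
    echo '  branch: ', json_encode(capture($branch, $text)), "\n";
}
```

For "i did" the plain pattern should return five entries, two of them empty, while the branch reset pattern returns only three. For "did i" both patterns give the same three entries, because the first alternative matches and preg_match does not report trailing unmatched groups.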