
Catch URL IDs from multiple matches in regexp


I'm writing a simple URL parser. With a regexp like the following

preg_match_all('/^test\/(\w+)\/?$/', $url, $matches);

I can match URLs like

test/5

and by inspecting the $matches array I can get the ID, which is 5. That's fine.
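For reference, the single-ID case can be sketched as below. The sample URL and the variable names are only illustrative, and preg_match is used instead of preg_match_all on the assumption that an anchored pattern can match at most once.

```php
<?php
$url = 'test/5'; // sample URL, assumed for illustration

// The pattern is anchored with ^ and $, so one match attempt is enough.
if (preg_match('/^test\/(\w+)\/?$/', $url, $matches) === 1) {
    $id = $matches[1]; // first capture group holds the ID
    echo $id, PHP_EOL;
}
```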

With a regexp like the following

preg_match_all('/^test\/((\w+)\/?)+\/(\w+)\/?$/', $url, $matches);

I can match URLs like

test/1/5
test/1/2/5
test/1/2/3/5

... and so on. The problem is that when I inspect the $matches array I can't get all the matched IDs of the variable-length part (which is ((\w+)\/?)+). I don't get 1, 2, 3 but 3, 3, 3: the last ID repeated N times.
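The behaviour can be reproduced with a short script. This is only a sketch using one of the sample URLs above; the variable names are illustrative. A capture group that sits inside a quantifier is overwritten on every iteration, so only the value from the last iteration is left once the match completes.

```php
<?php
$url = 'test/1/2/3/5';

preg_match_all('/^test\/((\w+)\/?)+\/(\w+)\/?$/', $url, $matches);

// The whole URL is consumed by a single match, so every group
// contributes exactly one value to $matches.
// Group 2 lives inside the repeated group and is overwritten on each
// iteration; only the value from its final iteration survives.
$repeated = $matches[2][0];
$last     = $matches[3][0];

var_dump($repeated, $last);
```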

What am I missing?


Solution

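  • I would do this job in two steps, because a capture group inside a quantifier keeps only the value from its last iteration, so a single match cannot return every ID.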

    First, you can check the URL format:

    ^test(?:\/\d+)+$
    


    Then, if the test succeeds, you can extract the IDs with this regex:

    (?:\G|^test)\/\K\d+
    

    The output array will only contain the IDs.

    Explanation

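    • (?:\G|^test) matches either the position where the previous match ended or test at the beginning of the string. On the first attempt \G also matches at the start of the string, which is why the format is checked first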
    • \/ matches a /
    • \K resets the starting point of the reported match, which here excludes the / from the result
    • \d+ matches 1 or more digits
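Put together, the two steps might look like the following sketch. The function name extractIds and the sample URLs are hypothetical, and the IDs are assumed to be numeric, as in the two patterns above.

```php
<?php
/**
 * Hypothetical helper: returns the IDs found in a URL such as
 * "test/1/2/5", or null when the URL does not have that shape.
 */
function extractIds(string $url): ?array
{
    // Step 1: validate the overall format.
    if (preg_match('/^test(?:\/\d+)+$/', $url) !== 1) {
        return null;
    }

    // Step 2: collect every ID. \K drops the slash from each match,
    // so the full-match array contains only the digits.
    preg_match_all('/(?:\G|^test)\/\K\d+/', $url, $matches);

    return $matches[0];
}

var_dump(extractIds('test/1/2/3/5')); // the four IDs, as strings
var_dump(extractIds('other/1/2'));    // NULL: fails the format check
```

Note that the validation pattern rejects a trailing slash, which the patterns in the question allowed; if that matters, appending \/? before the $ in step 1 should be enough, since step 2 only matches a slash that is followed by digits.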