php, regex, pcre

How can I match this array-like notation using regex in PHP?


I'm trying to match the following array-like pattern with regex:

foo[bar][baz][bim]

I almost have it with the following regex:

~([^[]+)(?:\[(.+?)\])*~gm

However, the capturing groups only include:

Full match: foo[bar][baz][bim]
Group 1: foo
Group 2: bim

I can't figure out why it's only capturing the last occurrence of the [] structure. I'd like it to capture foo, bar, baz, and bim in this case.
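
For reference, this is roughly how the behaviour can be reproduced in PHP. It is only a sketch: the g flag from the regex tester is dropped because PHP does not accept it as a pattern modifier, and the variable names are illustrative.

```php
<?php
// The pattern from above, minus the "g" flag (not a valid PHP modifier).
$pattern = '~([^[]+)(?:\[(.+?)\])*~';
$subject = 'foo[bar][baz][bim]';

preg_match($pattern, $subject, $m);

// Group 2 sits inside a repeated group, so only its final iteration survives:
// $m[0] is the full match, $m[1] is "foo", $m[2] is "bim".
print_r($m);
```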

Any ideas on what I'm missing?


Solution

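  • A repeated capturing group in PCRE keeps only the text from its last iteration; the values from earlier iterations are overwritten. To capture every occurrence, match each part as a separate match and chain the matches with the \G token: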

    (?|(\w+)|\G(?!\A)\[([^][]*)\])
    


    Regex breakdown:

    • (?| Start of a branch reset group (each alternative numbers its capturing groups from 1, so both captures end up in group 1)
      • (\w+) Capture word characters
      • | Or
      • \G(?!\A) Continue from where the previous match ended, but not at the very start of the subject
      • \[ Match an opening bracket
      • ([^][]*) Capture anything except [ and ]
      • \] Match a closing bracket
    • ) End of the branch reset group
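
To make the breakdown concrete, the sketch below prints each individual match next to its captured value. It assumes the same sample string as the question; PREG_SET_ORDER is a standard preg_match_all flag, and the variable names are only illustrative.

```php
<?php
$pattern = '~(?|(\w+)|\G(?!\A)\[([^][]*)\])~';
$subject = 'foo[bar][baz][bim]';

// PREG_SET_ORDER groups the results per match: [full match, group 1].
preg_match_all($pattern, $subject, $sets, PREG_SET_ORDER);

foreach ($sets as [$full, $key]) {
    // Every element is a separate match. The bracketed ones can only start
    // where the previous match ended, because of \G.
    printf("%-6s => %s\n", $full, $key);
}
```

The full matches are foo, [bar], [baz] and [bim], while group 1 holds the bare name in every case thanks to the branch reset group.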

    PHP code:

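    // Each match stores its capture in group 1, so $matches[1] lists every part in order.
    preg_match_all('~(?|(\w+)|\G(?!\A)\[([^][]*)\])~', 'foo[bar][baz][bim]', $matches);
    print_r($matches[1]); // foo, bar, baz, bim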
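
One thing to be aware of: the (\w+) branch is not anchored, so in a longer string it would also match any later word. If the input is supposed to be exactly one array-like expression, a possible approach is to check the overall shape first. The helper below is a hypothetical sketch, not part of the original answer; the function name and the validation pattern are assumptions.

```php
<?php
/**
 * Hypothetical helper: split "foo[bar][baz]" into ['foo', 'bar', 'baz'].
 * Returns null when the whole string is not in that shape.
 */
function parseArrayNotation(string $input): ?array
{
    // Assumed extra step: validate the whole string first, because the
    // (\w+) branch below would otherwise also match a later, unrelated word.
    if (!preg_match('~\A\w+(?:\[[^][]*\])*\z~', $input)) {
        return null;
    }

    preg_match_all('~(?|(\w+)|\G(?!\A)\[([^][]*)\])~', $input, $matches);

    return $matches[1];
}

var_dump(parseArrayNotation('foo[bar][baz][bim]')); // array with foo, bar, baz, bim
var_dump(parseArrayNotation('foo[bar] baz'));       // NULL: trailing text breaks the shape
```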