Tags: php, regex, parsing, regex-group

Regex - match an optionally repeated pattern of capture groups


Apologies for not knowing exactly how to word this question. There is probably a better title; I'm open to suggestions.

I have the following subjects:

(Field1 = 'Value1') and (Field2 = 'Value2')

and

(Field1 = 'Value1') and (Field2 = 'Value2') or (Field3 = 'Value3')

I want to match so that each expression between the () lands in its own group and each conjunction lands in its own group. For the second subject, that would be some variation of:

0: Field1 = 'Value1'
1: and
2: Field2 = 'Value2'
3: or
4: Field3 = 'Value3'

The good news is, I've got a regex that works on the first subject:

\(([A-Za-z0-9\s\'=]+)\) (and|or) \(([A-Za-z0-9\s\'=]+)\)

https://regex101.com/r/hMXAXS/1

But on the second subject it doesn't match the third group (the trailing "or ()"). I need to support an arbitrary number of groups. I can modify it to look only for "and ()", but then it doesn't match the first group.

How can I tell regex to do this? I either need to "double count" some groups (which is fine) or have some other way of optionally looking for additional patterns and matching them.

Thanks for the help!

PS: I was able to get my application to work with the regex ((and|or) \(([A-Za-z0-9\s\'=]+)\))+ and then just accepting that the first group would never match and creating application logic to support this. Still, I'd bet there's a better way.


Solution

  • If you are OK with getting three groups per match:

    1 = key
    2 = value
    3 = conjunction (and/or)

    then this regex will work, and it also allows parentheses in the value.

    /\((.*?) = '(.*?)'\) ?(and|or)?/gm
    

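    Applied to this string:

    (Field1 = 'Value1') and (Field2 = '(in parenthesis)') and (Field3 = 'Value3')

    it produces these matches:

    Match 1: key = Field1, value = Value1, conjunction = and
    Match 2: key = Field2, value = (in parenthesis), conjunction = and
    Match 3: key = Field3, value = Value3, no conjunction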
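
Since the question is tagged php, here is a minimal sketch of how the regex from this answer could be applied and flattened into the alternating condition/conjunction list the question asks for. It assumes PHP 7 or later. The helper name tokenizeConditions and the constant CONDITION_PATTERN are hypothetical, chosen only for illustration. The regex101-style g flag is not a valid PHP modifier, so preg_match_all() does the global scan instead, and the m flag is dropped because the pattern contains no ^ or $ anchors.

```php
<?php
// Pattern from the answer: group 1 = key, group 2 = value,
// group 3 = optional conjunction following the closing parenthesis.
const CONDITION_PATTERN = "/\((.*?) = '(.*?)'\) ?(and|or)?/";

/**
 * Hypothetical helper: returns [condition, conjunction, condition, ...].
 */
function tokenizeConditions(string $subject): array
{
    // PREG_SET_ORDER yields one array per match: [full, key, value, conjunction?].
    preg_match_all(CONDITION_PATTERN, $subject, $matches, PREG_SET_ORDER);

    $tokens = [];
    foreach ($matches as $m) {
        $tokens[] = "{$m[1]} = '{$m[2]}'";
        // The last condition has no trailing conjunction, so group 3
        // may be missing or empty; only append it when it is present.
        if (!empty($m[3])) {
            $tokens[] = $m[3];
        }
    }
    return $tokens;
}

$subject = "(Field1 = 'Value1') and (Field2 = 'Value2') or (Field3 = 'Value3')";
print_r(tokenizeConditions($subject));
```

For the second subject from the question this should yield the five entries listed there: Field1 = 'Value1', and, Field2 = 'Value2', or, Field3 = 'Value3'. Because every match carries its own condition plus the conjunction that follows it, no special handling of the first group is needed, unlike the workaround in the question's PS.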