I'm trying to implement Gmail-style filters in my search and I'm stuck on this regex problem. I need to capture ONE word, OR two words in quotes (but without the quotation marks themselves). This is PCRE (PHP).
For example:
name:mark
desired result: 1st capture group should be mark
name:"mark"
desired result: 1st capture group should be mark
name:"mark wilson"
desired result: 1st capture group should be mark, second capture group should be wilson
name:mark wilson
desired result: 1st capture group should be mark, wilson is ignored
The closest I've gotten is name:(\w+|\"\w+(?>\"|\s([a-z.'-]+\"))) which captures example 1 perfectly, but example 2 still includes the quotes, and example 3 ends up as:
group 1: "mark wilson" (quotes included)
group 2: wilson" (quote included)
I've tried lookaheads and lookbehinds, but I'm not getting anywhere with those either.
Any help would be much appreciated. TIA
One option is to use a conditional (if/else) construct. The first group optionally captures the opening ", and the conditional (?(1)") at the end requires a closing " only if group 1 matched. The whole value (mark, or mark wilson) is then in group 2, and wilson is in group 3.
\w+:(")?(\w+(?:\h+(\w+))?)(?(1)")
If you want the first name on its own, without the space and the second name, you could wrap it in its own group as well; mark is then in group 3 and wilson in group 4.
\w+:(")?((\w+)(?:\h+(\w+))?)(?(1)")
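As a rough sketch of how the conditional pattern above might be used from PHP: the helper name parseNameFilterConditional and the returned array keys are made up for illustration, not part of the original answer. Note that with this pattern an unquoted name:mark wilson still captures wilson, so the sketch also reports whether the value was quoted and leaves it to the caller to drop the second word if it was not.

```php
<?php
// Hypothetical helper (name and return shape are assumptions for this sketch).
function parseNameFilterConditional(string $input): ?array
{
    // Group 1: optional opening quote
    // Group 2: whole value, group 3: first word, group 4: optional second word
    // (?(1)") demands a closing quote only when group 1 took part in the match
    $pattern = '/\w+:(")?((\w+)(?:\h+(\w+))?)(?(1)")/';

    // PREG_UNMATCHED_AS_NULL reports groups that did not participate as null
    if (!preg_match($pattern, $input, $m, PREG_UNMATCHED_AS_NULL)) {
        return null;
    }

    return [
        'first'  => $m[3],
        'second' => $m[4] ?? null,   // null when there is no second word
        'quoted' => $m[1] !== null,  // true when the value was wrapped in quotes
    ];
}

var_dump(parseNameFilterConditional('name:"mark wilson"'));
```

If the fourth example from the question (wilson ignored when there are no quotes) has to hold, the caller could discard 'second' whenever 'quoted' is false.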
Alternatively, you could use a branch reset group, in which both alternatives share the same group numbers: the first word is always in group 1, whether or not the value is quoted, and the second word of a quoted value is in group 2.
\w+:(?|"(\w+)(?:\h+(\w+))?"|(\w+))
Explanation
\w+:
Match 1+ word chars
(?|
Branch reset group
"(\w+)
Capture group 1, match 1+ word chars
(?:
Non capture group
\h+
match 1+ horizontal whitespace chars
(\w+)
Capture group 2, match 1+ word chars
)?
Close group and make optional
"
Match "
|
Or
(\w+)
Capture group 1, match 1+ word chars
)
Close branch reset group
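To try the branch reset pattern against the four inputs from the question, something like the following should work. The function name parseNameFilterBranchReset and the two-element return array are assumptions made for this sketch only.

```php
<?php
// Hypothetical helper (name and return shape are assumptions for this sketch).
function parseNameFilterBranchReset(string $input): ?array
{
    // Both alternatives inside (?|...) number their groups starting from 1,
    // so the first word lands in group 1 for quoted and unquoted values alike.
    $pattern = '/\w+:(?|"(\w+)(?:\h+(\w+))?"|(\w+))/';

    if (!preg_match($pattern, $input, $m)) {
        return null;
    }

    // Group 2 is missing from $m when no second word was captured
    return [$m[1], $m[2] ?? null];
}

$inputs = ['name:mark', 'name:"mark"', 'name:"mark wilson"', 'name:mark wilson'];

foreach ($inputs as $input) {
    [$first, $second] = parseNameFilterBranchReset($input);
    printf("%s => first: %s, second: %s\n", $input, $first, $second ?? '(none)');
}
```

Because the unquoted alternative is just (\w+), name:mark wilson yields mark alone and wilson is left out of the match, which is the behaviour asked for in the fourth example.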