
Conditional Regex with lookarounds


I'm trying to write a regex that captures the value from data in key/value pair format. I couldn't find exactly what I needed, but I suspect it involves conditional lookarounds. I'm also not sure that is the best approach.

The key/value pair will look like the following:

  • ... source=<value> ... -- No quotes/spaces
  • ... source="<value with spaces>" ... -- quotes with spaces
  • ... source="<value>" ... -- quotes with no spaces

I'm whitelisting characters with the expression \bsource(::|=)([0-9a-zA-Z_\-\*\"\:\.\/]+). If the value contains spaces, only the first word is captured; if I add spaces to the whitelist, I capture more than I need. Matching the value without the surrounding double quotes would be awesome too!

Data samples:

... source="source name with quotes - special characters also" ...

... source=source_name_without_quotes_with_special-characters* ...

... source="source_name_with_quotes_no_spaces-*" ...
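
For reference, here is a minimal sketch of how the whitelist expression behaves on these three samples. It assumes Python's re module, since no regex flavor is named above, and the helper name whitelist_value is made up for illustration:

```python
import re

# The whitelist expression from the question, unchanged.
WHITELIST = re.compile(r'\bsource(::|=)([0-9a-zA-Z_\-\*\"\:\.\/]+)')

samples = [
    '... source="source name with quotes - special characters also" ...',
    '... source=source_name_without_quotes_with_special-characters* ...',
    '... source="source_name_with_quotes_no_spaces-*" ...',
]


def whitelist_value(line):
    """Return what the second capture group grabs, or None if nothing matches."""
    m = WHITELIST.search(line)
    return m.group(2) if m else None


for line in samples:
    print(repr(whitelist_value(line)))
```

The first sample stops at the first space and keeps the opening quote, and the third keeps both quotes, which are the two problems described above.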

Any assistance or guidance would be extremely helpful, thanks in advance!

~Tensore


Solution

  • A conditional expression would look like this:

    \bsource(?::|=)(")?(?(1)(?P<value1>[^"]+)"|(?P<value2>\S+))
    

    See a demo on regex101.com.


    But you do not really need a conditional here; a simple alternation is enough:

    \bsource(?::|=)(?:"(?P<value1>[^"]+)"|(?P<value2>\S+))
    

    See another demo on regex101.com.


    Where the flavor supports it (PCRE, for example), you could even use a branch reset group so that both alternatives share the same group name:

    \bsource(?::|=)(?|"(?P<value>[^"]+)"|(?P<value>\S+))
    

    See the last demo on regex101.com.