.net, regex, regex-lookarounds, lookbehind

Regex positive look-ahead not preventing a match


If this regex:

^(?:(?:\([^\(\)]+\))|(?:(?<!\()[^\(\)]+(?!\))))$

matches abc and (abc) but not (abc or abc), why can't I use it in a positive look-ahead like this?

^(?=(?:(?:\([^\(\)]+\))|(?:(?<!\()[^\(\)]+(?!\)))))(?:\(?[a-z]+\)?)$

It matches abc) for example.


Solution

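  • Your first regex can be reduced to ^(?:\([^()]+\)|[^()]+)$. When you moved it into the lookahead, you left out the end anchor $, so the lookahead only has to match a prefix of the string. For abc), [^()]+ backtracks to ab, where (?!\)) succeeds, and the rest of the pattern then consumes abc). The direct "quick fix" is to add $ inside the lookahead: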

    ^(?=(?:\([^()]+\)|[^()]+)$)\(?[a-z]+\)?$
    

    See the regex demo.
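
The difference between the two lookaheads can be checked with a small script. This is a minimal sketch that uses Python's `re` module as a stand-in for the .NET engine, on the assumption that both engines treat these particular constructs (lookahead, fixed-width lookbehind, `^`, `$`) the same way; the names `UNANCHORED`, `ANCHORED`, and `accepts` are made up for illustration.

```python
import re

# Pattern from the question: the lookahead has no `$`, so it can succeed on a prefix.
UNANCHORED = re.compile(
    r"^(?=(?:(?:\([^\(\)]+\))|(?:(?<!\()[^\(\)]+(?!\)))))(?:\(?[a-z]+\)?)$"
)

# Quick fix from the answer: simplified alternation with `$` inside the lookahead.
ANCHORED = re.compile(r"^(?=(?:\([^()]+\)|[^()]+)$)\(?[a-z]+\)?$")


def accepts(pattern, text):
    """Return True if the compiled pattern matches the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    # Assumption: Python's `re` is used here only to illustrate; the question is about .NET.
    for sample in ["abc", "(abc)", "(abc", "abc)"]:
        print(sample, accepts(UNANCHORED, sample), accepts(ANCHORED, sample))
```

Only `abc)` should be treated differently by the two patterns: the unanchored lookahead lets it through, the anchored one rejects it.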

    The second regex can also be written simply as ^(?:\([a-z]+\)|[a-z]+)$, with two alternatives that match a string of lowercase letters either inside parentheses or without them.
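
As a rough check, the lookahead-free alternation can be tried on the four inputs from the question. The sketch below again assumes Python's `re` as a substitute for .NET (the pattern uses only syntax common to both); `SIMPLE` and `is_valid` are hypothetical names.

```python
import re

# Two alternatives: lowercase letters wrapped in parentheses, or bare lowercase letters.
# Assumption: Python's `re` stands in for the .NET regex engine here.
SIMPLE = re.compile(r"^(?:\([a-z]+\)|[a-z]+)$")


def is_valid(text):
    """Return True for 'abc' or '(abc)' style input, False for unbalanced parentheses."""
    return SIMPLE.search(text) is not None


if __name__ == "__main__":
    for sample in ["abc", "(abc)", "(abc", "abc)"]:
        print(sample, is_valid(sample))
```

Because each alternative spells out both parentheses or neither, no lookaround is needed to reject the unbalanced inputs.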

    In .NET, you may also use

    ^(\()?[a-z]+(?(1)\))$
    

    See demo.

    Details

    • ^ - start of string
    • (\()? - an optional capturing group #1 matching a (
    • [a-z]+ - 1+ lowercase ASCII letters (use \p{Ll}+ to match any Unicode lowercase letters)
    • (?(1)\)) - a conditional construct: if Group 1 matched (that is, there was an opening parenthesis), match )
    • $ - end of string.
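
The breakdown above can be exercised with a short script. Python's `re` module happens to support the same `(?(1)...)` conditional syntax, so it is used here as an assumed stand-in for .NET. Note that Python's `re` does not support `\p{Ll}`, so only the `[a-z]` form is shown. `CONDITIONAL` and `is_balanced` are illustrative names.

```python
import re

# Group 1 optionally captures "("; the conditional (?(1)\)) requires ")" only if it did.
# Assumption: Python's `re` is used in place of the .NET engine discussed in the answer.
CONDITIONAL = re.compile(r"^(\()?[a-z]+(?(1)\))$")


def is_balanced(text):
    """Return True if the text is lowercase letters with both or neither parenthesis."""
    return CONDITIONAL.search(text) is not None


if __name__ == "__main__":
    for sample in ["abc", "(abc)", "(abc", "abc)"]:
        print(sample, is_balanced(sample))
```

With this form the letter class is written once, which may be easier to maintain than repeating it in two alternatives.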