Given the following text:
My name is foo.
My name is bar.
To return each line that contains, or does not contain, a particular substring, either of the following positive and negative lookahead patterns can be used, and both return the same result:
Positive lookahead: ^(?=.*bar).*$
returns My name is bar.
Negative lookahead: ^((?!foo).)*$
returns My name is bar.
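As a quick check, the two patterns above can be run with Python's re module. This is only a sketch: the variable names are illustrative, and re.MULTILINE is an assumption made here so that ^ and $ anchor at each line of the sample text rather than at the whole string.

```python
import re

text = "My name is foo.\nMy name is bar."

# re.MULTILINE makes ^ and $ match at the start and end of every line.
positive = re.compile(r"^(?=.*bar).*$", re.MULTILINE)
negative = re.compile(r"^((?!foo).)*$", re.MULTILINE)

# finditer + group(0) returns the whole match; findall would return only
# the contents of the capturing group in the negative pattern.
positive_lines = [m.group(0) for m in positive.finditer(text)]
negative_lines = [m.group(0) for m in negative.finditer(text)]

print(positive_lines)  # ['My name is bar.']
print(negative_lines)  # ['My name is bar.']
```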
However, why does the negative lookahead need to be nested within multiple sets of parentheses, with the dot . and the quantifier * separated by the parentheses, whereas in the positive lookahead they can be adjacent (.*)?
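The construct where the negative lookahead is nested within a group together with the dot ., and the quantifier * is applied to that group, is called a tempered greedy token. The group is needed because the lookahead has to be checked before every character the dot consumes, not just once at the start. You do not have to use it in this scenario.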
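To see why the dot and the star cannot simply be placed next to each other after the negative lookahead, it may help to compare the two forms directly. The sketch below uses Python's re module with the sample text from the question; the variable names and the use of re.MULTILINE are assumptions for illustration.

```python
import re

text = "My name is foo.\nMy name is bar."

# Lookahead checked once, at the start of the line only: it rejects lines
# that *begin* with "foo", so both sample lines still match.
adjacent = re.compile(r"^(?!foo).*$", re.MULTILINE)

# Lookahead checked before every character the dot consumes: the match
# cannot pass through any position where "foo" starts.
tempered = re.compile(r"^((?!foo).)*$", re.MULTILINE)

adjacent_lines = [m.group(0) for m in adjacent.finditer(text)]
tempered_lines = [m.group(0) for m in tempered.finditer(text)]

print(adjacent_lines)  # ['My name is foo.', 'My name is bar.']
print(tempered_lines)  # ['My name is bar.']
```

In the positive pattern, .* sits inside the lookahead and scans ahead for bar from the line start, which is why no per-character check is needed there.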
You can use a normal lookahead anchored at the start instead of the tempered greedy token:
^(?!.*foo).*$
See the regex demo
Here,
^ - matches the location at the start of the string
(?!.*foo) - a negative lookahead failing the match if there is foo somewhere on the line (or string if DOTALL mode is on)
.*$ - any 0+ characters (but a newline if DOTALL mode is off) up to the end of string/line.

What to use?
The tempered greedy token is usually much less efficient, since the lookahead is run at every position. Use the lookahead anchored at the start when you just need to check whether a string contains something or not. However, the tempered greedy token might be required in some cases. See When to Use this Technique.
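The two recommendations can be sketched in Python as follows. The helper lines_without is hypothetical, and the {START}/{END} delimiters are made up for this example; they are only meant to illustrate one kind of case where the tempered form is hard to avoid, namely matching up to the nearest end delimiter without crossing another start delimiter.

```python
import re

def lines_without(text, substring):
    """Return the lines of `text` that do not contain `substring`.

    Hypothetical helper: uses a negative lookahead anchored at the line start.
    """
    pattern = re.compile(rf"^(?!.*{re.escape(substring)}).*$", re.MULTILINE)
    return [m.group(0) for m in pattern.finditer(text)]

text = "My name is foo.\nMy name is bar."
print(lines_without(text, "foo"))  # ['My name is bar.']

# Tempered greedy token: the lookahead is checked before each character,
# so the match can never run across a second {START}.
innermost = re.compile(r"\{START\}(?:(?!\{START\}).)*?\{END\}")
sample = "{START} a {START} b {END}"
print(innermost.findall(sample))  # ['{START} b {END}']
```

A lookahead anchored once at the start could not express the second pattern, because the restriction has to hold at every position between the two delimiters.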