I need a regex that matches incorrect AND / OR logic statements, but not when they are inside quotes. For example:
No matches should be found in:
MAR AND SATURN
MAR OR SATURN
"MAR AND SATURN"
There won't be any matches if AND or OR has at least one whitespace character plus one non-whitespace character on both sides, and those characters are not themselves OR or AND. So, for example, ..R AND S.. should not match, but (OR) OR (OR) or (AND) AND (AND) should.
Matches should be found in:
MARS AND SATURN [AND]
MARS [OR]
MARS [ OR ]
[AND] AND [AND]
[OR] [AND]
[OR] [AND]
[AND] [OR]
[ AND ] [ OR ]
You will notice some examples contain whitespace before, after, or on both sides of the AND or OR operator; these also need to match.
I'm using the .NET framework and this is what I came up with which works. However, it seems too complicated! There has to be a way to simplify it.
(?!.*\"")(?<!(?:\bAND\b\s|\bOR\b\s))(?:\b(?:AND|OR)\b)(?=\s\b(?:AND|OR)\b)|(?<=\bAND\b\s|\bOR\b\s)(?:\b(?:AND|OR)\b)(?!\s\b(?:AND|OR)\b)|^\b(?:AND|OR)\b|(?:AND\s?|OR\s?)$|(?<=\()\s?(?:\bAND\b|\bOR\b)|(?<=\()(?:\bOR|\bAND)(?=\))|(?:\bOR|\bAND)(?=\))(?!.*\"")
I think this will do:
^ *'[^']*' *$|^ *"[^"]*" *$|(\b(AND|OR)\b) +(?1)|(?1)\s*$|^\s*(?1)
Demo: https://regex101.com/r/nD9yR3/2
Note that this regex matches the invalid strings.
(?1) is a recursive (subroutine) reference: it repeats the regex of group 1.
The leading ^ *'[^']*' *$|^ *"[^"]*" *$| part is there to ignore strings that sit entirely inside quotes. An input counts as a match only if group 1 has a value, not merely group 0.
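As a rough illustration of the "quoted alternatives come first, then check the group" idea, here is a minimal Python sketch. It is not the exact pattern above: the (?1) subroutine syntax is PCRE-specific and is not available in Python's re module or in .NET's Regex class, so the operator sub-pattern is written out in full. The names OP, QUOTED, BAD and has_bad_operator are made up for this example, and a named group "bad" stands in for group 1. It only covers the alternatives in the pattern above (doubled operators, and operators at the start or end of the input).

```python
import re

# Hypothetical names for this sketch; the operator pattern is repeated
# explicitly instead of using the (?1) subroutine call.
OP = r"\b(?:AND|OR)\b"

# Alternatives that swallow a fully quoted input without setting any group.
QUOTED = r"""^ *'[^']*' *$|^ *"[^"]*" *$"""

# Invalid usages: two operators in a row, a trailing operator,
# or a leading operator. All of them are captured in the group "bad".
BAD = rf"(?P<bad>{OP} +{OP}|{OP}\s*$|^\s*{OP})"

PATTERN = re.compile(QUOTED + "|" + BAD)


def has_bad_operator(text):
    """Return True only when the match came from the 'bad' group."""
    m = PATTERN.search(text)
    return m is not None and m.group("bad") is not None


if __name__ == "__main__":
    for sample in ["MAR AND SATURN", '"MAR AND SATURN"', "MARS OR", "OR AND"]:
        print(repr(sample), has_bad_operator(sample))
```

Because the expanded pattern uses only constructs that .NET also supports, the same string should be usable with Regex.Match there, checking the named group's Success property instead of group 0.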