c# regex regex-lookarounds

Regex for matching specific words or their combination as well as the exception word


I would like to check source code scripts according to such a pattern:

'DELETE', 'FROM' or 'DELETE FROM', followed by a space and then a single word, where that word is not 'DUAL' and is not followed by a dot and another word (i.e. it is not schema-qualified)

Something like

'(DELETE|FROM|DELETE FROM) \w+', where \w+ != DUAL
but not
'(DELETE|FROM|DELETE FROM) \w+\.\w+'

Examples

text fragment                    desired result
begin DELETE tbl1;               DELETE tbl1
select FROM tbl2) loop           FROM tbl2
fnc(); DELETE FROM tbl3 where    DELETE FROM tbl3
qqq DELETE DUAL; www             (no match)
eee FROM DUAL rrr                (no match)
ttt DELETE FROM DUAL where       (no match)
yy DELETE sch1.tbl1; uuu         (no match)
iii FROM sch2.tbl2 ooo           (no match)
ppp DELETE FROM sch3.tbl3 aaa    (no match)

My guess

(FROM|DELETE( FROM)?) (?!DUAL)(?!\w+\.)\w+

matches too much in its first part. Is the second part (after the space) correct?
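
To make the over-match concrete, here is a minimal C# sketch that runs the guess against a few of the fragments above. The class name GuessCheck and the helper FirstMatch are illustrative names only, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GuessCheck
{
    // The guess from the question, unchanged (verbatim string keeps backslashes literal).
    public const string Guess = @"(FROM|DELETE( FROM)?) (?!DUAL)(?!\w+\.)\w+";

    // Illustrative helper: first match of the guess in a fragment, or "" if there is none.
    public static string FirstMatch(string text)
    {
        Match m = Regex.Match(text, Guess);
        return m.Success ? m.Value : "";
    }

    public static void Main()
    {
        // Desired: no match. The optional " FROM" is given back on backtracking,
        // and FROM is then consumed by \w+ as if it were the table name.
        Console.WriteLine("[" + FirstMatch("ttt DELETE FROM DUAL where") + "]");
        Console.WriteLine("[" + FirstMatch("ppp DELETE FROM sch3.tbl3 aaa") + "]");

        // Without a leading FROM to fall back on, the lookaheads reject these as intended.
        Console.WriteLine("[" + FirstMatch("qqq DELETE DUAL; www") + "]");
        Console.WriteLine("[" + FirstMatch("yy DELETE sch1.tbl1; uuu") + "]");
    }
}
```

The first two calls return DELETE FROM even though no match is wanted, while the last two return nothing. That suggests the problem lies in the optional FROM being backtracked, not in the two lookaheads themselves.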


Solution

  • You might use

    \b(?>DELETE(?: FROM)?|FROM) (?!DUAL\b)\w+\b(?!\.)
    

    The pattern matches:

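    • \b Word boundary to prevent a partial word match
    • (?>DELETE(?: FROM)?|FROM) Atomic group (no backtracking) that matches either DELETE with an optional FROM, or only FROM. Because the group is atomic, the engine cannot give back " FROM" after a later failure and then treat FROM as the table name
    • (?!DUAL\b) Negative lookahead asserting that what is directly to the right is not the whole word DUAL (a name that merely starts with DUAL is still allowed)
    • \w+\b Match 1+ word characters followed by a word boundary, so the match cannot stop in the middle of a word
    • (?!\.) Negative lookahead asserting that there is no dot directly to the right of the current position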

    .NET Regex demo
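
As a rough check, the pattern above can be run against the nine fragments from the question with a small C# program. This is only a sketch: the class name DeleteFromScan and the helper FindMatches are made up for illustration, and the fragments are copied from the question's examples.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DeleteFromScan
{
    // Pattern from the answer, as a verbatim string so the backslashes stay literal.
    public const string Pattern = @"\b(?>DELETE(?: FROM)?|FROM) (?!DUAL\b)\w+\b(?!\.)";

    // Illustrative helper: every match found in one text fragment.
    public static string[] FindMatches(string text) =>
        Regex.Matches(text, Pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        string[] fragments =
        {
            "begin DELETE tbl1;",
            "select FROM tbl2) loop",
            "fnc(); DELETE FROM tbl3 where",
            "qqq DELETE DUAL; www",
            "eee FROM DUAL rrr",
            "ttt DELETE FROM DUAL where",
            "yy DELETE sch1.tbl1; uuu",
            "iii FROM sch2.tbl2 ooo",
            "ppp DELETE FROM sch3.tbl3 aaa",
        };

        foreach (string fragment in fragments)
        {
            string[] found = FindMatches(fragment);
            string shown = found.Length == 0 ? "(no match)" : string.Join(", ", found);
            // Left-align the fragment in a 32-character column for readability.
            Console.WriteLine($"{fragment,-32}=> {shown}");
        }
    }
}
```

The first three fragments should yield DELETE tbl1, FROM tbl2 and DELETE FROM tbl3, and the remaining six should yield no match. The pattern is case-sensitive as written. If the scripts may contain lowercase keywords, passing RegexOptions.IgnoreCase to Regex.Matches is one option, though whether that is wanted depends on the input.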