regex-lookarounds regex-negation

How to use regex to not match a line if a certain word exists on that line?


I have the following regex

(?!.*internal).*auditor['’s]*.*(?=report)|(?!.*internal)(?<=report).*auditor['’s]*.*

and the following test cases

report of auditor
report of external auditor
auditor external report
in auditor report
auditor report
internal report of auditor
report of internal auditor
auditor internal report

I want to match if "report" appears before or after auditor['’s]*, but I do not want to match if the word "internal" is present.

With my regex above, "internal report of auditor" is still matched.
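The unwanted match can be reproduced with a minimal Python sketch. It assumes Python's `re` module and an unanchored search, which may differ from the flavor selected on regex101; the helper name `find_match` is hypothetical. The negative lookahead is only checked at the position where a match attempt starts, so the engine is free to start after the word "internal".

```python
import re

# Original pattern from the question, copied verbatim and split over two
# string literals only for readability.
ORIGINAL = re.compile(
    r"(?!.*internal).*auditor['’s]*.*(?=report)"
    r"|(?!.*internal)(?<=report).*auditor['’s]*.*"
)


def find_match(line):
    """Return the matched text for an unanchored search, or None."""
    m = ORIGINAL.search(line)
    return m.group(0) if m else None


if __name__ == "__main__":
    # The second alternative starts right after "report", where no
    # "internal" lies ahead, so the lookahead passes.
    print(repr(find_match("internal report of auditor")))  # ' of auditor'
    print(repr(find_match("report of internal auditor")))  # None
```

Anchoring the pattern at the start of the line (with `^`) forces the lookahead to scan the whole line instead of only the part after the start position.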

Here is the desired result

report of auditor
report of external auditor
auditor external report
in auditor report
auditor report

Here is the regex101


Solution

  • The "'s" suffix on "auditor" seems irrelevant to the match, so drop that unnecessary complication.

    Your requirement can be expressed as:

    • contains "auditor"
    • contains "report" (because "before or after something" just means "contains" - the "something" is irrelevant)
    • does not contain "internal"

    Putting that into a regex:

    ^(?!.*\binternal\b)(?=.*\breport\b).*\bauditor\b.*
    

    I put word boundaries (\b) around the terms so that, for example, "internalization" does not count as "internal" and "reporting" does not count as "report".

    See live demo, showing this matching all but the last 3 lines of your sample input.
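As a quick check of the pattern above, here is a minimal Python sketch that runs it over the sample input from the question. It assumes Python's `re` module and one test string per line; the names `PATTERN`, `SAMPLE_LINES` and `matching_lines` are hypothetical and only used for illustration.

```python
import re

# Pattern from the answer: no "internal", has "report", has "auditor",
# each as a whole word.
PATTERN = re.compile(r"^(?!.*\binternal\b)(?=.*\breport\b).*\bauditor\b.*")

# Sample input from the question, in the same order.
SAMPLE_LINES = [
    "report of auditor",
    "report of external auditor",
    "auditor external report",
    "in auditor report",
    "auditor report",
    "internal report of auditor",
    "report of internal auditor",
    "auditor internal report",
]


def matching_lines(lines):
    """Return the lines accepted by PATTERN, preserving input order."""
    return [line for line in lines if PATTERN.match(line)]


if __name__ == "__main__":
    # Prints the first five sample lines; the three containing
    # "internal" are rejected.
    for line in matching_lines(SAMPLE_LINES):
        print(line)
```

Because of the word boundaries, "auditor's report" is still accepted (the apostrophe is a non-word character), while the plural "auditors report" is not. If plurals should match, the pattern would need something like `\bauditors?\b`.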