Regex to match all words except those in quotes and a list of excluded words


I would like to match all words except those inside quotes (including escaped quotes) and except certain words.

My current regex:

\b(?![[^"]*"[^"]*"]|AND|OR|NOT\b)\w+/gm

Example phrase:

Visualise AND Visionary OR "Experiences Tester" OR (project AND manager)

My expected result is the following matches:

Visualise, Visionary, project, manager

But right now I get: Visualise, Visionary, Experiences Teste, project, manager

How do I write a correct regex?

Thank you.


Solution

  • For the exact task described in the question, you can use

    (?:\b(?:AND|OR|NOT)\b|"[^"\\]*(?:\\.[^"\\]*)*")(*SKIP)(*F)|\w+
    

    See the regex demo.

    Details:

    • (?:\b(?:AND|OR|NOT)\b|"[^"\\]*(?:\\.[^"\\]*)*")(*SKIP)(*F) - match either of the following and skip it ((*F) forces the match to fail, and (*SKIP) makes the next attempt start after the skipped text):
      • \b(?:AND|OR|NOT)\b - AND, OR, or NOT as a whole word
      • "[^"\\]*(?:\\.[^"\\]*)*" - a double-quoted string that may contain escaped double quotes
    • | - or
    • \w+ - match one or more word chars.

    Remember that backslashes must be doubled inside PHP single- or double-quoted string literals, so each \\ in the pattern becomes \\\\ in the PHP source; see the regex101 code snippet.
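
For readers whose regex engine lacks the PCRE verbs, here is a minimal sketch of the same idea in Python. Python's built-in `re` module does not support `(*SKIP)(*F)`, so this swaps in a different technique: the alternatives to discard are matched first without capturing, and only the remaining words are captured in group 1. The names `PATTERN` and `extract_terms` are made up for this illustration and are not part of the original answer.

```python
import re
from typing import List

# Alternatives to discard come first; only the last alternative is captured.
# This is the capture-group workaround, not the (*SKIP)(*F) verbs themselves.
PATTERN = re.compile(
    r'''
    \b(?:AND|OR|NOT)\b            # operator keywords as whole words
    | "[^"\\]*(?:\\.[^"\\]*)*"    # double-quoted string, escaped chars allowed
    | (\w+)                       # any other word, captured in group 1
    ''',
    re.VERBOSE,
)


def extract_terms(query: str) -> List[str]:
    """Return the words outside quotes that are not AND, OR, or NOT."""
    # Matches of the first two alternatives leave group 1 unset (None),
    # so filtering on group 1 drops keywords and quoted strings.
    return [
        m.group(1)
        for m in PATTERN.finditer(query)
        if m.group(1) is not None
    ]


if __name__ == "__main__":
    phrase = 'Visualise AND Visionary OR "Experiences Tester" OR (project AND manager)'
    print(extract_terms(phrase))
    # ['Visualise', 'Visionary', 'project', 'manager']
```

Because the quoted-string alternative consumes the whole quoted span before `\w+` gets a chance to match, words inside quotes never reach group 1. Keywords are only discarded as whole words, so a term such as ANDROID is still returned.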