regex, regex-lookarounds, regex-negation

Regex to match all variables, but exclude uppercase words


I'm a regex beginner, and I'm having some trouble excluding a pattern in my search. I have a domain-specific language that uses all-uppercase words as keywords. I'd like to ignore these, but capture all possible variable names.

Example variable names:

  • VarWithCapitals
  • variable
  • var_with_snake_case
  • var_with_{curly}_braces
  • Var_with_The_{kitchen123}_Sink

Some example keywords:

  • CMD
  • DO WHILE ENDWHILE
  • FOR

The regex I have so far matches everything but does not exclude the capitalized keywords: \b[a-zA-Z0-9_{}]*\b

How can I exclude words that consist only of uppercase letters, but still match my other variable names?


Solution

  • You can use a negative lookahead to exclude the patterns you don't want:

    \b(?![A-Z]+\b)[a-zA-Z0-9_{}]*\b
    

    Check the proof

    Explanation:

    • (?!...) is a negative lookahead: the text that follows must NOT match the enclosed pattern
    • (?![A-Z]+\b) fails the match at this position if the text that follows consists only of uppercase letters up to the next word boundary
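
As a quick check of the lookahead, here is a minimal Python sketch. It is not part of the original answer: the sample line and the names `VARIABLE`, `sample`, and `names` are made up for illustration, and the quantifier is changed from `*` to `+` as an assumption, because with `*` the pattern can also produce empty matches at word boundaries.

```python
import re

# Pattern from the answer, with `+` instead of `*` (an assumption made here
# so that findall does not also return empty strings).
VARIABLE = re.compile(r"\b(?![A-Z]+\b)[a-zA-Z0-9_{}]+\b")

# Hypothetical line mixing the question's keywords and variable names.
sample = (
    "DO WHILE VarWithCapitals FOR var_with_{curly}_braces "
    "CMD variable Var_with_The_{kitchen123}_Sink ENDWHILE"
)

# All-uppercase words are rejected by the lookahead; everything else is kept.
names = VARIABLE.findall(sample)
print(names)
```

The keywords DO, WHILE, FOR, CMD, and ENDWHILE are skipped because each is made up only of uppercase letters up to the next word boundary, while a name such as VarWithCapitals passes because the uppercase run is followed by a lowercase letter rather than a boundary.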