
Regex negative lookaheads


I am trying to create a regex to check for sentences that are like the following:

Some text {word1} some more text {word2} maybe some more text

The string starts with some text, followed by {word1} (including the curly braces), followed by more text, followed by {word2} (including the curly braces), and optionally some final text. The sentence must contain both {word1} and {word2}, {word1} must come before {word2}, and each of them may appear only once in the sentence.

I am having a lot of trouble creating a regex for this, mostly because of the negative lookaheads. Could anyone help me write a regex that checks for this? Below are some examples of text that should not pass.

Some text (no word1 or word2)
Some text {word1} some more text (no word2)
Some text {word2} some more text (no word1)
Some text {word1} some more text {word1} some more text {word2} maybe some more text (word1 appears multiple times)
Some text {word2} some more text {word1} maybe some more text (word2 precedes word1)
{word1} some text {word2} maybe some more text (no text preceding word1)
Some text {word1} {word2} maybe some more text (no text between word1 and word2)

Note that word1 and word2 are only stand-ins; I am not using those specific words in the regex. They can be arbitrary words, as long as they are not the same. For example, word1 could be "apple" and word2 could be "banana".

Any and all help is greatly appreciated.


Solution

  • This is fairly complicated, but the regex can be built from five components:

    1. Verify text before {word1}:

      ^\s*((?!\{word[12]\})\S)+((?!\{word[12]\}).)*

    2. Match {word1} itself:

      \{word1\}

    3. Verify text after {word1} and before {word2}:

      \s*((?!\{word[12]\})\S)+((?!\{word[12]\}).)*

    4. Match {word2} itself:

      \{word2\}

    5. Verify text after {word2} to end:

      ((?!\{word[12]\}).)*$

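    Full regex (written for the literal words word1 and word2; for other words, replace each word[12] with an alternation of the two words, e.g. (?:apple|banana)):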

    ^\s*((?!\{word[12]\})\S)+((?!\{word[12]\}).)*\{word1\}\s*((?!\{word[12]\})\S)+((?!\{word[12]\}).)*\{word2\}((?!\{word[12]\}).)*$
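
Since the question says the two words are arbitrary, here is a minimal Python sketch of how the five components above could be assembled for any pair of words. The helper names `build_pattern` and `is_valid` are hypothetical and chosen for illustration only. The sketch assumes Python's `re` module and uses non-capturing groups instead of the capturing groups in the answer, which should not change what matches.

```python
import re


def build_pattern(word1: str, word2: str) -> re.Pattern:
    """Build the answer's regex for two arbitrary placeholder words (hypothetical helper)."""
    # Escaped placeholders including the braces, e.g. \{apple\} and \{banana\}
    p1 = re.escape("{" + word1 + "}")
    p2 = re.escape("{" + word2 + "}")
    # Negative lookahead: neither placeholder starts at the current position
    not_ph = f"(?!{p1}|{p2})"
    # Components 1 and 3: optional whitespace, at least one non-whitespace
    # character, then any further characters, with no placeholder in between
    some_text = rf"\s*(?:{not_ph}\S)+(?:{not_ph}.)*"
    # Component 5: possibly empty trailing text with no placeholder
    any_text = rf"(?:{not_ph}.)*"
    return re.compile(rf"^{some_text}{p1}{some_text}{p2}{any_text}$")


def is_valid(sentence: str, word1: str, word2: str) -> bool:
    """Return True if the sentence has the required shape (hypothetical helper)."""
    return build_pattern(word1, word2).search(sentence) is not None


if __name__ == "__main__":
    samples = [
        "Some text {apple} some more text {banana} maybe some more text",
        "Some text {banana} some more text {apple} maybe some more text",
        "Some text {apple} {banana} maybe some more text",
    ]
    for s in samples:
        print(is_valid(s, "apple", "banana"), "|", s)
```

Because both the "some text" and "any text" parts apply the lookahead to every character, a second {word1} or {word2} anywhere in the sentence cannot be consumed as ordinary text, which is what enforces the "appears only once" rule.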