How to match all < and > except those in bold tags


I have a regular expression [&<>] for matching < and > in text.

For example, in <i>text</i> it matches <, >, <, >.

But I don't want it to match the brackets of <b> and </b>.

How can I do this?

Example: <i>match me</i> <b>don't match me</b> <i>match me</i>

It should match only the < and > of the italic tags.


Solution

  • You can achieve this using negative lookarounds:

    (?<!b)>|(?!<b)(?!</b)<
    

    Explanation:

    (?<!b)>|(?!<b)(?!</b)<
           |                # match either
          >                 # a >,
    (?<!b)                  # not preceded by a b
                         <  # or a <,
            (?!<b)          # at a position not followed by <b
                  (?!</b)   # and not followed by </b either
    

    Lookbehind assertions typically have to have a fixed length, but lookaheads do not. The opening angle bracket uses two lookaheads here, one for <b> and one for </b>; they can also be merged into a single (?!</?b).
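
To try the pattern outside the demo, here is a minimal sketch in Python's re module (the question does not name an engine, so the choice of Python is an assumption). The names PATTERN, MERGED, find_brackets and sample are made up for illustration; the sample string is the one from the question.

```python
import re

# Pattern from the answer: a > not preceded by b,
# or a < at a position that does not start <b or </b.
PATTERN = re.compile(r"(?<!b)>|(?!<b)(?!</b)<")

# Same idea with the two lookaheads merged into one
# (lookaheads are not restricted to a fixed length).
MERGED = re.compile(r"(?<!b)>|(?!</?b)<")


def find_brackets(text, pattern=PATTERN):
    """Return (position, character) for every angle bracket the pattern matches."""
    return [(m.start(), m.group()) for m in pattern.finditer(text)]


sample = "<i>match me</i> <b>don't match me</b> <i>match me</i>"

for pos, ch in find_brackets(sample):
    print(pos, ch)
```

On the sample this reports only the eight brackets of the italic tags. Note that the pattern is looser than "exactly the bold tag": it also skips the < of any tag whose name starts with b (such as <br>) and any > that directly follows a b (such as in <sub>).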