Match specific text only when it is not between tags

I have some regular expressions that wrap content in tags. If I apply the same expressions to the resulting text, I get tags nested inside tags. The three lines below show the original text, the result after one pass, and the result after a second pass.


Lorem ipsum 123456 dolor sit @twitter amet, consectetur adipiscing elit example .


Lorem ipsum [tel]123456[/tel] dolor sit [tw]@twitter[/tw] amet, consectetur adipiscing elit [a]example[/a] .


Lorem ipsum [tel][tel]123456[/tel][/tel] dolor sit [tw][tw]@twitter[/tw][/tw] amet, consectetur adipiscing elit [a][a]example[/a][/a] .

What should I add to my regular expressions so that they do not match content that is already between a [tag] and a [/tag]?


  • Description


    Regex: (?:[0-9]+|twitter|consectetur)(?![0-9a-z]*\[\/[a-z]+\])

    Replace with: [XX]$0[/XX]

    This regular expression will do the following:

    • finds every run of digits, the word twitter, and the word consectetur. I chose these substrings to illustrate the expression; they can be replaced with other strings.
    • verifies that the match is not already followed by a close tag
    • avoids edge cases
      • the construct [0-9]+ on its own could match a substring such as 2345, which is in the source string but is already wrapped in tags
      • twitter matched without its leading @ is still followed by a close tag, so it is skipped
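The first edge case can be seen in a short Python sketch. The `naive` pattern is a hypothetical simplification introduced only for comparison; the `guarded` pattern uses the lookahead from the node-by-node explanation in this answer, restricted to the digits branch.

```python
import re

wrapped = "[tel]123456[/tel]"

# Hypothetical simpler lookahead: only checks whether a close tag starts
# immediately after the match.
naive = re.compile(r"[0-9]+(?!\[/[a-z]+\])")

# Lookahead from this answer: first skip any remaining letters or digits,
# then check for the close tag.
guarded = re.compile(r"[0-9]+(?![0-9a-z]*\[/[a-z]+\])")

print(naive.findall(wrapped))    # ['12345']  (backtracks to a shorter run)
print(guarded.findall(wrapped))  # []         (every substring is rejected)
```

The naive version gives up one digit so that the next character is `6` rather than `[`, which is why the `[0-9a-z]*` prefix inside the lookahead matters.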


    Sample Text

    123456 Lorem ipsum [tel]123456[/tel] dolor sit [tw]@twitter[/tw] amet, consectetur adipiscing elit [a]example[/a]

    Sample After Replacement

    [XX]123456[/XX] Lorem ipsum [tel]123456[/tel] dolor sit [tw]@twitter[/tw] amet, [XX]consectetur[/XX] adipiscing elit [a]example[/a]
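As a rough check, the replacement can be reproduced with Python's `re` module. Two things here are assumptions rather than part of the answer: `\g<0>` is Python's spelling of `$0`, and `re.IGNORECASE` is added so that the `[a-z]+` in the lookahead also recognizes the upper-case `[/XX]` tag on a second pass. The function name `wrap` is made up for this sketch.

```python
import re

PATTERN = re.compile(
    r"(?:[0-9]+|twitter|consectetur)"   # substrings to wrap
    r"(?![0-9a-z]*\[/[a-z]+\])",        # unless a close tag follows
    re.IGNORECASE,  # assumption: needed so [/XX] counts as a close tag
)

def wrap(text: str) -> str:
    # \g<0> inserts the whole match, like $0 in other regex flavors.
    return PATTERN.sub(r"[XX]\g<0>[/XX]", text)

sample = ("123456 Lorem ipsum [tel]123456[/tel] dolor sit [tw]@twitter[/tw] "
          "amet, consectetur adipiscing elit [a]example[/a]")

once = wrap(sample)
twice = wrap(once)
print(once)
print(once == twice)  # True: a second pass adds no nested tags
```

Without the case-insensitive flag, a second pass would wrap the `[XX]` spans again, because `[a-z]+` would not match `XX`. Using a lower-case replacement tag would be another way around that.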


    NODE                     EXPLANATION
      (?:                      group, but do not capture:
        [0-9]+                   any character of: '0' to '9' (1 or more
                                 times (matching the most amount
                                 possible))
       |                        OR
        twitter                  'twitter'
       |                        OR
        consectetur              'consectetur'
      )                        end of grouping
      (?!                      look ahead to see if there is not:
        [0-9a-z]*                any character of: '0' to '9', 'a' to 'z'
                                 (0 or more times (matching the most
                                 amount possible))
        \[                       '['
        \/                       '/'
        [a-z]+                   any character of: 'a' to 'z' (1 or more
                                 times (matching the most amount
                                 possible))
        \]                       ']'
      )                        end of look-ahead
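Note that this lookahead only inspects the run of letters and digits directly before a close tag, so a target separated from the close tag by a space (for example `[a]call 123456 now[/a]`) would still be wrapped. One alternative, which is a different technique from the lookahead in this answer, is to match already-tagged spans first and leave them untouched. The sketch below assumes tags are not nested and that a span does not cross a line break; the names `TAGGED_OR_TARGET` and `wrap_untagged` are made up for illustration.

```python
import re

# Alternation order matters: an existing [tag]...[/tag] span is tried first and
# consumed whole, so the second branch never sees text inside it.
TAGGED_OR_TARGET = re.compile(
    r"\[([a-z]+)\].*?\[/\1\]"           # an already-tagged span
    r"|([0-9]+|twitter|consectetur)",   # group 2: a bare target
    re.IGNORECASE,
)

def wrap_untagged(text: str) -> str:
    def repl(m: re.Match) -> str:
        if m.group(2) is None:
            return m.group(0)            # tagged span: keep unchanged
        return f"[XX]{m.group(2)}[/XX]"  # bare target: wrap it
    return TAGGED_OR_TARGET.sub(repl, text)

print(wrap_untagged("[a]call 123456 now[/a] or 789"))
# [a]call 123456 now[/a] or [XX]789[/XX]
```

Because wrapped output is itself a tagged span, running `wrap_untagged` on its own result leaves the text unchanged.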