Tags: javascript, java, regex, regex-lookarounds, regular-language

Find specific text between words using Regex


I'm trying to replace a string, but only where it occurs inside a "tag". How can I do this with a regex?

For example:

Text multiline, bla bla bla **FOO** text text text 
*START_TAG* text text  text text **FOO** a lot of texts
**FOO**  more text
*END_TAG*

I would like to replace every FOO that lies between START_TAG and END_TAG.

I tried doing something like this:

(?<=word1)(.*?)(?=word2)

or

(?<=word1)FOO(?=word2)

But in the first case I get everything inside the tag and in the second nothing is found.

I searched a lot, but the answers I found are about matching a string inside parentheses, all the text between two words, and so on.

I'm using Java for this, but a JavaScript solution would also work.


Solution

  • In Java, you may use a one-regex solution like

    // (?s) lets . match line breaks, which the multiline sample requires
    String result = s.replaceAll("(?s)((?:\\G(?!\\A)|START_TAG)(?:(?!START_TAG|FOO).)*?)FOO(?=.*END_TAG)", "$1<REPLACED>");
    

    See the regex demo.

    Details

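    • ((?:\\G(?!\\A)|START_TAG)(?:(?!START_TAG|FOO).)*?) - Group 1:
      • (?:\\G(?!\\A)|START_TAG) - the end of the previous match (\\G, with (?!\\A) ruling out the start of the string) or START_TAG
      • (?:(?!START_TAG|FOO).)*? - any char, 0+ repetitions, as few as possible, that does not start a START_TAG or FOO char sequence
    • FOO - a FOO to match and replace
    • (?=.*END_TAG) - a positive lookahead that checks there is an END_TAG somewhere to the right of the current location. Since . does not match line breaks by default, the DOTALL flag (inline (?s) or Pattern.DOTALL) is needed when the tags and FOO are on different lines.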
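To make the breakdown above concrete, here is a minimal, self-contained sketch that runs the one-regex approach with the inline (?s) flag, since the sample from the question spans several lines. The class name SingleRegexReplace and the second pattern, STOP_AT_END_TAG, are illustrative assumptions and not part of the original answer. The second pattern adds END_TAG to the negative lookahead. Without that, the \G chain can apparently run past a closed block and replace a FOO outside it whenever another END_TAG appears later in the input.

```java
import java.util.regex.Pattern;

class SingleRegexReplace {
    // The answer's pattern, with (?s) so '.' also matches line breaks.
    static final Pattern AS_ANSWERED = Pattern.compile(
        "(?s)((?:\\G(?!\\A)|START_TAG)(?:(?!START_TAG|FOO).)*?)FOO(?=.*END_TAG)");

    // Assumed variant (not from the answer): END_TAG is added to the negative
    // lookahead so the \G chain cannot continue past the end of a block.
    static final Pattern STOP_AT_END_TAG = Pattern.compile(
        "(?s)((?:\\G(?!\\A)|START_TAG)(?:(?!START_TAG|END_TAG|FOO).)*?)FOO(?=.*END_TAG)");

    static String replace(Pattern pattern, String input) {
        // $1 puts back everything that was matched in front of FOO.
        return pattern.matcher(input).replaceAll("$1<REPLACED>");
    }

    public static void main(String[] args) {
        String sample = "Text multiline, bla bla bla **FOO** text text text \n"
            + "*START_TAG* text text  text text **FOO** a lot of texts\n"
            + "**FOO**  more text\n"
            + "*END_TAG*";
        System.out.println(replace(AS_ANSWERED, sample));

        // Two blocks with a FOO between them: the two patterns differ here.
        String twoBlocks = "START_TAG FOO END_TAG FOO START_TAG x END_TAG";
        System.out.println(replace(AS_ANSWERED, twoBlocks));
        System.out.println(replace(STOP_AT_END_TAG, twoBlocks));
    }
}
```

On the question's sample both patterns leave the first FOO alone and replace the two inside the block. They only diverge on inputs with more than one START_TAG ... END_TAG pair, so it is worth testing against your real data before choosing one.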

    In JS, a two-step replacement seems to be best:

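    // Step 1: match each START_TAG ... END_TAG block ([\s\S] also matches line breaks)
    var rx = /START_TAG[\s\S]*?END_TAG/g;
    var str = "Text multiline, bla bla bla **FOO** text text text *START_TAG* text text text text **FOO** a lot of texts\n**FOO**  more text\n*END_TAG*";
    // Step 2: replace FOO only inside the matched block ($0 is the whole match)
    var result = str.replace(rx, function ($0) {return $0.replace(/FOO/g, "<REPLACED>");} );
    console.log(result);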
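Since the question is about Java, the same two-step idea can be sketched there as well. This is a minimal example under assumptions, not part of the original answer: the class name TwoStepReplace and the method replaceInsideTags are hypothetical. It uses the classic Matcher.appendReplacement / appendTail loop, which should also work on older Java versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoStepReplace {
    // Step 1 pattern: one START_TAG ... END_TAG block, as short as possible.
    // [\s\S] matches any character, including line breaks.
    private static final Pattern BLOCK = Pattern.compile("START_TAG[\\s\\S]*?END_TAG");

    static String replaceInsideTags(String input, String target, String replacement) {
        Matcher matcher = BLOCK.matcher(input);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // Step 2: plain (literal) replacement inside the matched block only.
            String block = matcher.group().replace(target, replacement);
            // quoteReplacement keeps '$' and '\' in the block from being
            // interpreted as group references by appendReplacement.
            matcher.appendReplacement(out, Matcher.quoteReplacement(block));
        }
        // Copy whatever follows the last block unchanged.
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String str = "Text multiline, bla bla bla **FOO** text text text "
            + "*START_TAG* text text text text **FOO** a lot of texts\n"
            + "**FOO**  more text\n*END_TAG*";
        System.out.println(replaceInsideTags(str, "FOO", "<REPLACED>"));
    }
}
```

Because the inner step only ever sees the text of one matched block, a FOO before START_TAG or after END_TAG is never touched, which makes this version easier to reason about than the single-regex one.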