
Replace specific words using regular expression (globally) - Java


I am trying to come up with a regular expression that replaces specific words regardless of their position or order, but it doesn't seem to work.

Example input:

This is a a an the a the testing

regex:

(\sa\s)|(\san\s)|(\sthe\s)

Actual output:

This is a the the testing

Expected output:

This is testing
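
A minimal reproduction of the problem might look like the sketch below. The question does not show the actual `replaceAll` call, so the single-space replacement string is an assumption (it is the replacement that yields the output reported above), and the class and method names `ReproduceProblem` and `stripWords` are hypothetical.

```java
class ReproduceProblem {
    // Applies the regex from the question.
    // Assumption: each match is replaced with a single space.
    static String stripWords(String input) {
        return input.replaceAll("(\\sa\\s)|(\\san\\s)|(\\sthe\\s)", " ");
    }

    public static void main(String[] args) {
        // Only some of the words are removed, because each match
        // also consumes the whitespace on both sides of the word.
        System.out.println(stripWords("This is a a an the a the testing"));
        // prints: This is a the the testing
    }
}
```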

Solution

  • Your regex fails to match some of the a, an, or the substrings because matches cannot overlap. For example, in the string foo an an an, the regex above matches the first <space>an<space> but not the second an, because the first match has already consumed the space that precedes the second an. Checking the trailing whitespace with a lookahead, so that it is tested but not consumed, avoids this:

    string = string.replaceAll("\\s(?:an|the|a)(?=\\s)", "");
    


    The above regex fails if one of the words appears at the very start or end of the string, where there is no whitespace on that side. In that case, you could use this:

    String test = "a an the an test is a success and an example";
    System.out.println(test.replaceAll("\\s(?:an|the|a)(?=\\s|$)|^(?:an|the|a)(?=\\s)", "").trim());
    

    Output:

    test is success and example
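
To see the effect of the consumed space described in this answer, the sketch below lists where each pattern matches in a sample string. The class name `OverlapDemo`, the helper `matchStarts`, and the sample input are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverlapDemo {
    // Returns the start index of every match of regex in input.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String input = "foo an an an bar";

        // The trailing \s is part of the match, so the space before the
        // second "an" is already consumed and that word is skipped.
        System.out.println(matchStarts("\\san\\s", input));     // prints: [3, 9]

        // The lookahead only checks for the trailing space without
        // consuming it, so every " an" is found.
        System.out.println(matchStarts("\\san(?=\\s)", input)); // prints: [3, 6, 9]
    }
}
```

The consuming pattern finds two matches and the lookahead pattern finds all three, which is why the lookahead version removes every occurrence.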