I am trying to come up with a regular expression to replace specific words irrespective of position / order, but it doesn't seem to work.
Example input:
This is a a an the a the testing
Regex:
(\sa\s)|(\san\s)|(\sthe\s)
Actual output:
This is a the the testing
Expected output:
This is testing
Your regex fails to match some of the "a", "an" or "the" substrings, mainly because matches cannot overlap. That is, in the string "foo an an an", the above regex matches the first <space>an<space>, but it won't match the second "an" because the first match has already consumed the space that exists before the second "an".
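To make the consumed-space problem concrete, here is a small sketch that counts how many matches the engine finds in "foo an an an" for three variants of the pattern. The class name OverlapDemo and the helper countMatches are made up for this illustration; only java.util.regex from the standard library is used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OverlapDemo {
    // Counts the non-overlapping matches the regex engine finds in input.
    public static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "foo an an an";
        // The trailing \s is consumed, so the next "an" has no leading space left.
        System.out.println(countMatches("\\san\\s", input));
        // A lookahead checks for the following space without consuming it.
        System.out.println(countMatches("\\san(?=\\s)", input));
        // Also accepting end of input picks up the final "an".
        System.out.println(countMatches("\\san(?=\\s|$)", input));
    }
}
```

This prints 1, 2 and 3: the consuming pattern finds only the first "an", the lookahead version finds the first two, and allowing end of input in the lookahead finds all three.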
Use a lookahead so that the trailing space is checked but not consumed:
string = string.replaceAll("\\s(?:an|the|a)(?=\\s)", "");
The above regex fails if one of those words appears at the very start or end of the string. In that case, you could use this:
String test = "a an the an test is a success and an example";
System.out.println(test.replaceAll("\\s(?:an|the|a)(?=\\s|$)|^(?:an|the|a)(?=\\s)", "").trim());
Output:
test is success and example
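If the replacement runs often, one option is to compile the pattern once and wrap it in a helper. The following is only a sketch: the class name StopWordStripper and the method strip are hypothetical, and the regex is the one from the snippet above.

```java
import java.util.regex.Pattern;

public class StopWordStripper {
    // Same regex as above: a word preceded by whitespace and followed by
    // whitespace or end of input, or a word at the very start of the input.
    private static final Pattern STOP_WORDS = Pattern.compile(
            "\\s(?:an|the|a)(?=\\s|$)|^(?:an|the|a)(?=\\s)");

    // Removes the matched words, then trims any leftover leading/trailing space.
    public static String strip(String input) {
        return STOP_WORDS.matcher(input).replaceAll("").trim();
    }

    public static void main(String[] args) {
        // Input from the question.
        System.out.println(strip("This is a a an the a the testing"));
        // Words at both the start and the end of the input.
        System.out.println(strip("the end of the"));
        // Longer words that merely start with "an" or "the" are left alone.
        System.out.println(strip("and another theme"));
    }
}
```

This prints "This is testing", "end of" and "and another theme", so the input from the question now gives the expected output.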