Java String.replaceAll not replacing consecutive repeated sequences


I'm attempting to use String.replaceAll to strip some whitespace out of a string. However, when several matches of the regex pattern appear back to back in the string, only every second one is replaced.

Very simple example:

String theString = "foo x x x x x bar";        
String trimmed = theString.replaceAll("x\\s*x", "xx");        
System.out.println(theString);
System.out.println(trimmed);

What I want to see:

foo x x x x x bar
foo xxxxx bar

What I see:

foo x x x x x bar
foo xx xx x bar

It seems that replaceAll doesn't consider the replacement text as a candidate for being itself replaced, and instead skips merrily onwards.

Is there an easy fix for this?


Solution

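  • The problem is that your pattern also consumes the x after the whitespace. The regex engine resumes searching at the end of the previous match, so after the first match (the first "x x") the next search starts here: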

    foo x x x x x bar
           ^
           |---- HERE
    

    You don't want to consume that x, so use a lookahead, which checks that an x follows without making it part of the match:

    .replaceAll("x\\s+(?=x)", "x");
    

    You could even go with both a lookahead and a lookbehind:

    .replaceAll("(?<=x)\\s+(?=x)", "");
    

    (Note that the * quantifier has been replaced with +. With +, the pattern no longer matches where there is no whitespace between two x characters, and in that case there is nothing to replace anyway.)
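
To compare the three patterns side by side, here is a minimal sketch that runs each of them on the string from the question. The class name CollapseDemo and the method names are made up for illustration; only String.replaceAll and the patterns themselves come from the question and answer above.

```java
public class CollapseDemo {
    // Pattern from the question: the trailing x is part of the match,
    // so it is not available as the start of the next match.
    static String original(String s) {
        return s.replaceAll("x\\s*x", "xx");
    }

    // Lookahead variant: (?=x) requires an x to follow but does not consume it.
    static String withLookahead(String s) {
        return s.replaceAll("x\\s+(?=x)", "x");
    }

    // Lookbehind plus lookahead: only the whitespace itself is matched and removed.
    static String withLookaround(String s) {
        return s.replaceAll("(?<=x)\\s+(?=x)", "");
    }

    public static void main(String[] args) {
        String input = "foo x x x x x bar";
        System.out.println(original(input));       // foo xx xx x bar
        System.out.println(withLookahead(input));  // foo xxxxx bar
        System.out.println(withLookaround(input)); // foo xxxxx bar
    }
}
```

Both lookaround versions leave the space before "bar" alone, because the character after that whitespace is not an x.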