
Escaping double backslashes with regular expressions in Java


I have this unit test:

public void testDeEscapeResponse() {
    final String[] inputs = new String[] {"peque\\\\u0f1o", "peque\\u0f1o"};
    final String[] expected = new String[] {"peque\\u0f1o", "peque\\u0f1o"};
    for (int i = 0; i < inputs.length; i++) {
        final String input = inputs[i];
        final String actual = QTIResultParser.deEscapeResponse(input);
        Assert.assertEquals(
            "deEscapeResponse did not work correctly", expected[i], actual);
    }
}

I have this method:

static String deEscapeResponse(String str) {
    return str.replaceAll("\\\\", "\\");
}

The unit test is failing with this error:

java.lang.StringIndexOutOfBoundsException: String index out of range: 1
    at java.lang.String.charAt(String.java:686)
    at java.util.regex.Matcher.appendReplacement(Matcher.java:703)
    at java.util.regex.Matcher.replaceAll(Matcher.java:813)
    at java.lang.String.replaceAll(String.java:2189)
    at com.acme.MyClass.deEscapeResponse
    at com.acme.MyClassTest.testDeEscapeResponse

Why?


Solution

  • Use String.replace, which does a literal replacement, instead of String.replaceAll, which uses regular expressions.

    Example:

    "peque\\\\u0f1o".replace("\\\\", "\\")    //  gives  peque\u0f1o
    

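    String.replaceAll takes a regular expression, so the Java literal "\\\\" is compiled as the regex \\, which matches a single \. The replacement string also treats \ specially: it escapes the character that follows it. A replacement consisting of a lone \ (the Java literal "\\") has nothing after it to escape, which is why Matcher.appendReplacement fails with the exception in your stack trace.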

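    To make String.replaceAll work as you expect here, you would need to escape every backslash twice, once for the Java string literal and once for the regex engine, in both the pattern and the replacement: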

    "peque\\\\u0f1o".replaceAll("\\\\\\\\", "\\\\")
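If you would rather not count backslashes by hand, a possible alternative is to let the library do the quoting. The sketch below is only an illustration: the class name DeEscapeDemo and the method names deEscapeLiteral and deEscapeQuoted are made up for this example and are not part of the original code. It uses Pattern.quote and Matcher.quoteReplacement from java.util.regex and assumes the goal is simply to turn every pair of backslashes into one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
public class DeEscapeDemo {

    // Literal replacement: no regex is involved, so each backslash only
    // needs the usual Java string-literal escaping.
    static String deEscapeLiteral(String str) {
        return str.replace("\\\\", "\\");
    }

    // Regex replacement where the library quotes both arguments, so the
    // backslashes do not have to be escaped a second time by hand.
    static String deEscapeQuoted(String str) {
        return str.replaceAll(Pattern.quote("\\\\"), Matcher.quoteReplacement("\\"));
    }

    public static void main(String[] args) {
        String input = "peque\\\\u0f1o"; // the characters peque\\u0f1o
        System.out.println(deEscapeLiteral(input)); // prints peque\u0f1o
        System.out.println(deEscapeQuoted(input));  // prints peque\u0f1o
    }
}
```

Both methods leave a string that contains only single backslashes unchanged, which matches the second case in the unit test from the question. For a fixed literal like this one, String.replace is still the simpler choice.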