java regex replaceall

Multiple regexes for replacing characters in Java


I have the following string:

String str = "Klaße, STRAßE, FUß";

Using a combined regex, I want to replace the German letter ß with ss or SS, depending on the case of the surrounding letters. To do this I have:

String replaceUml = str
        .replaceAll("ß", "ss")
        .replaceAll("A-Z|ss$", "SS")
        .replaceAll("^(?=^A-Z)(?=.*A-Z$)(?=.*ss).*$", "SS");

Expected result:

Klasse, STRASSE, FUSS

Actual result:

Klasse, STRAssE, FUSS

Where am I going wrong?


Solution

  • String replaceUml = str
        .replaceAll("(?<=\\p{Lu})ß", "SS")
        .replace("ß", "ss");
    

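    The first call uses a regex that matches ß only when it is preceded by a Unicode uppercase letter (\p{Lu}), as in "SÜß", and replaces it with a capital "SS". The second call is a plain, non-regex replace that turns every remaining ß into "ss".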

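    The (?<= ... ) construct is a look-behind: it checks the text before the match without making it part of the match. You could also capture the preceding letter and write it back with $1:

        .replaceAll("(\\p{Lu})ß", "$1SS")
    

    This works because ß does not occur at the beginning of a word, so there is always a preceding letter to capture.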

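    Your main problem was the missing brackets: without them, A-Z matches the literal text "A-Z" rather than any uppercase letter, as [A-Z] would. Only the ss$ alternative ever matched, which is why just the ss at the very end of the string (FUss) became SS.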
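
For reference, here is a minimal runnable sketch of both variants applied to the string from the question. The class name EszettReplace and the helper names withLookBehind and withCaptureGroup are made up for this illustration, and the source file is assumed to be compiled as UTF-8 so that the ß literals survive.

```java
public class EszettReplace {

    // Hypothetical helper: the look-behind variant from the answer.
    static String withLookBehind(String input) {
        return input
                .replaceAll("(?<=\\p{Lu})ß", "SS") // ß directly after an uppercase letter
                .replace("ß", "ss");               // every remaining ß (literal replace, no regex)
    }

    // Hypothetical helper: the capture-group variant from the answer.
    static String withCaptureGroup(String input) {
        return input
                .replaceAll("(\\p{Lu})ß", "$1SS")  // $1 writes the captured letter back
                .replace("ß", "ss");
    }

    public static void main(String[] args) {
        String str = "Klaße, STRAßE, FUß";
        System.out.println(withLookBehind(str));   // Klasse, STRASSE, FUSS
        System.out.println(withCaptureGroup(str)); // Klasse, STRASSE, FUSS
    }
}
```

The order of the two calls matters: the uppercase case has to be handled first, because once every ß has become "ss" there is nothing left for the case-sensitive pattern to match.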