
Java: UTF-32 to UTF-16 converter


I'm trying to get the Java escape sequence (the UTF-16 surrogate pair) for a Unicode code point.
Example: 1F612 ==> \ud83d\ude12

I tried:

String toConvert = "\ud83d\ude12";
String result = "";
for(int x=0;x<toConvert.length();x++){
    int codePoint = Character.codePointAt(toConvert, x);
    String hexStr = Integer.toHexString(codePoint);
    hexStr = formatUTF(hexStr);
    result += hexStr;
}
System.out.println(result);

formatUTF function:

public static String formatUTF(String hex){
    String text = hex;
    for(int x = 0; x<4-hex.length();x++)
        text = "0"+text;
    return "\\u"+text;
}

But the output is:

run:
\u1f612\ude12

Note: 1F612 (hex) = 128530 (decimal)

Please help.


Solution

  • Maybe this clarifies things: iterate over the string once by code point and once by char.

        for (int i = 0; i < toConvert.length(); ) {
            int codePoint = Character.codePointAt(toConvert, i);
            i += Character.charCount(codePoint);
            System.out.printf("[%d] cp: %x%n", i, codePoint);
        }
        for (int i = 0; i < toConvert.length(); ++i) {
            char ch = toConvert.charAt(i);
            System.out.printf("[%d] c: %x%n", i, (int)ch);
        }
    

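    The string holds one single code point made up of two 16-bit chars. (The first index reads 2 because it is printed after being advanced past both chars.)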

        [2] cp: 1f612
        [0] c: d83d
        [1] c: de12
    

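    Exactly as UTF-16 specifies: a code point above U+FFFF is encoded as a high surrogate followed by a low surrogate. The loop in the question advances one char at a time, so codePointAt sees the full pair at index 0 (1f612) and then the lone low surrogate at index 1 (de12).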
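
    To get the escapes the question asks for, one option is to split the code point into UTF-16 code units first and format each unit separately. The following is only a minimal sketch: the class name SurrogateEscapes and the helpers escapeUnit and toJavaEscapes are made up for illustration, while Character.toChars and String.format are standard JDK calls.

```java
final class SurrogateEscapes {

    // Formats one UTF-16 code unit as a backslash-u escape with four hex digits.
    static String escapeUnit(char unit) {
        return String.format("\\u%04x", (int) unit);
    }

    // Converts a single code point (a UTF-32 value) to its Java escapes.
    static String toJavaEscapes(int codePoint) {
        StringBuilder sb = new StringBuilder();
        // Character.toChars returns one char for BMP code points and a
        // high/low surrogate pair for supplementary code points.
        for (char unit : Character.toChars(codePoint)) {
            sb.append(escapeUnit(unit));
        }
        return sb.toString();
    }

    // Escapes every UTF-16 code unit of an existing String.
    // Iterating with charAt (not codePointAt) keeps surrogates separate.
    static String toJavaEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            sb.append(escapeUnit(text.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Starting from the hex string used in the question.
        int codePoint = Integer.parseInt("1F612", 16);
        System.out.println(toJavaEscapes(codePoint));

        // Starting from a String that already contains the character.
        System.out.println(toJavaEscapes("\ud83d\ude12"));
    }
}
```

    Both calls in main should print the surrogate pair d83d, de12 in escaped form. Code points up to U+FFFF come out as a single escape, so the same helper covers both cases.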