Tags: java, string, javafx, encoding, character-encoding

String encoding doesn't work properly in Java


I am developing a JavaFX application. I need to create a TreeView programmatically, using Persian text for its node names.
The problem is that I see strange characters when I run the application. I searched the web and similar questions on SO, and wrote a function to do the encoding based on the answers to one of those questions:

public static String getUTF(String encodeString) {
        return new String(encodeString.getBytes(StandardCharsets.ISO_8859_1),
                         StandardCharsets.UTF_8);
}

And I use it to convert my string to build the TreeView:

CheckBoxTreeItem<String> userManagement = 
             new CheckBoxTreeItem<>(GlobalItems.getUTF("کاربران"));

This approach doesn't work properly for some characters:

[Screenshot: TreeView in which some characters are still garbled]

I still get strange results. If I don't use the encoding function, I get:

[Screenshot: TreeView without the conversion applied]


Solution

  • For hard-coded string literals, the javac compiler must be told to use the same encoding as the Java source files, say UTF-8 (for example with javac -encoding UTF-8). Check the IDE and build settings. As a test, you can u-escape some Farsi characters, e.g. \u062f for Dal, د. If the escaped characters come through correctly while the literal ones do not, the compiler is using the wrong encoding.

    A String always contains Unicode, so there is no need to create new Strings with a re-conversion hack like getUTF.

    When reading text from files, the bytes (byte[]/InputStream) must be converted to Java text (String/Reader) by specifying the encoding of those bytes.
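The points above can be tried out with a small sketch. It is only an illustration, not a fix for any particular project: the class name EncodingSketch, the constant USERS and the helper readFirstLine are made up for this example, and getUTF is the function from the question. The word from the question is written with u-escapes so the source file is pure ASCII and does not depend on the compiler's encoding setting.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class EncodingSketch {

    // The Persian word from the question, written with u-escapes so that
    // the literal is independent of the javac -encoding setting.
    public static final String USERS =
            "\u06A9\u0627\u0631\u0628\u0631\u0627\u0646";

    // The re-conversion hack from the question: encode as ISO-8859-1,
    // then decode the resulting bytes as UTF-8.
    public static String getUTF(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1),
                          StandardCharsets.UTF_8);
    }

    // Hypothetical helper: turns bytes into text by naming the encoding
    // of those bytes explicitly (assumed here to be UTF-8).
    public static String readFirstLine(InputStream in) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A correctly compiled literal is already Unicode: 7 characters.
        System.out.println(USERS.length());   // prints 7

        // Applying the hack to a correct String destroys it, because
        // Persian letters cannot be encoded in ISO-8859-1.
        System.out.println(getUTF(USERS));    // prints ???????

        // The hack only undoes one specific mistake: UTF-8 bytes that
        // were decoded as ISO-8859-1 somewhere earlier.
        String mojibake = new String(USERS.getBytes(StandardCharsets.UTF_8),
                                     StandardCharsets.ISO_8859_1);
        System.out.println(getUTF(mojibake).equals(USERS));   // prints true

        // Bytes from a file or stream: state the encoding when decoding.
        InputStream in = new ByteArrayInputStream(
                USERS.getBytes(StandardCharsets.UTF_8));
        System.out.println(readFirstLine(in).equals(USERS));  // prints true
    }
}
```

If the u-escaped constant shows up correctly in the TreeView while the same word typed directly in the source does not, that points at the compiler's source encoding rather than at JavaFX or at the String itself.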