java utf-8 jar

JAR not able to encode languages properly


I'm building a web app with Javalin. Everything worked fine until I started running the JAR directly from the command prompt. I noticed that all emojis, international text, and custom-font characters appear as "??????" in my JSON response. So I tested with a simple string in my main method and printed it to the console. These are the results I'm getting:

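import java.nio.charset.StandardCharsets;

public class App {

        public static void main(String[] args) {

                String myString = "Привет мир . مرحبا بالعالم .Olá Mundo";

                // Encode as ISO-8859-1, then decode as UTF-8 (mismatched charsets)
                byte[] bytes = myString.getBytes(StandardCharsets.ISO_8859_1);
                String value = new String(bytes, StandardCharsets.UTF_8);

                // Encode and decode with the same charset (UTF-8)
                byte[] bytes2 = myString.getBytes(StandardCharsets.UTF_8);
                String value2 = new String(bytes2, StandardCharsets.UTF_8);

                System.out.println("value: " + value);
                System.out.println("value2: " + value2);
        }

}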

Try #1: java -jar app.jar
Try #2: java -Dfile.encoding=UTF-8 -jar app.jar

This is the output I'm getting for both commands:

value: ??Ñ????²?µÑ? ?¼??Ñ? . ???±?????? ?????????????? .Ol?¡ Mundo
value2: ????????? ???? . ?????? ?????????? .Olá Mundo
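As a point of comparison, here is a minimal sketch of what the two round trips in the test should produce when only the charsets are involved. ISO-8859-1 has no Cyrillic or Arabic characters, so encoding with it replaces each of them with `?` before anything is printed, while a UTF-8 round trip is lossless. The class name `EncodingRoundTrip` and the helper `roundTrip` are hypothetical, and the explicit UTF-8 `PrintStream` only helps if the terminal itself displays UTF-8.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {

    // Encode the text with the given charset and decode it again with the same charset.
    // Characters the charset cannot represent are replaced during encoding.
    public static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String text = "Привет Olá";

        // Cyrillic letters are not in ISO-8859-1, so each one becomes '?'; 'á' survives.
        String lossy = roundTrip(text, StandardCharsets.ISO_8859_1);

        // UTF-8 can represent every character, so the string comes back unchanged.
        String lossless = roundTrip(text, StandardCharsets.UTF_8);

        // Write through a stream with an explicit encoding instead of the platform default.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println("ISO-8859-1 round trip: " + lossy);
        out.println("UTF-8 round trip: " + lossless);
    }
}
```

If the UTF-8 line still shows question marks, the string was probably already damaged before it reached the stream, or the console cannot display those characters.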

I followed this answer to get to this point.

I noticed that this issue doesn't occur when I run the app from a Gradle task, but as soon as I run the JAR directly, the problem appears. Note that this is the first time I'm using Java on the backend. I'm mostly a Node.js developer, where handling emojis and international text is not a big issue. Any help will be greatly appreciated. Thanks.
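One way to compare the two launch modes is to print what the JVM reports as its default charset, once from the Gradle task and once with `java -jar`. The sketch below is only a diagnostic under that assumption. The class name `CharsetReport` and the method `describe` are hypothetical; `Charset.defaultCharset()` and the `file.encoding` system property are standard.

```java
import java.nio.charset.Charset;

public class CharsetReport {

    // Build a one-line summary of the charset settings the running JVM reports.
    public static String describe() {
        return "default charset: " + Charset.defaultCharset().name()
                + ", file.encoding: " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // Run this both ways and compare the two lines.
        System.out.println(describe());
    }
}
```

If the two runs report different values, any call that relies on the platform default, such as `getBytes()` with no argument, will behave differently between them.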

UPDATE

I also tested this through an API endpoint:

App.java

import io.javalin.Javalin;

public class App {
        public static void main(String[] args) {
            // Start the server on port 7000 and register the test route
            Javalin app = Javalin.create().start(7000);
            app.get("/api/test", Models::config);
        }
}

Models.java

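import io.javalin.http.Context;
import java.nio.charset.StandardCharsets;

public class Models {
        public static void config(Context context) {
            String myString = "Привет мир . مرحبا بالعالم .Olá Mundo";
            // getBytes() without an argument uses the platform default charset
            byte[] bytes2 = myString.getBytes();
            String value2 = new String(bytes2, StandardCharsets.UTF_8);

            context.result(value2);
        }
}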

Running the JAR with command #1 gives this output in the browser:

?????? ??? . ????? ??????? .Olá Mundo

Running the JAR with command #2 gives this output:

ะ?ั?ะธะฒะตั? ะผะธั? . ู?ุฑุญุจุง ุจุงู?ุนุงู?ู? .Olรก Mundo
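The second output looks like correct UTF-8 bytes being read with a different charset: the `á` in "Olá" is the two bytes 0xC3 0xA1 in UTF-8, and a single-byte charset turns each byte into its own character. The sketch below reproduces that effect with ISO-8859-1, chosen only because every JVM is guaranteed to ship it; which charset the browser actually used is an assumption. The class name `MojibakeDemo` and the helper `decodeUtf8BytesAs` are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encode the text as UTF-8, then decode those bytes with the reader's charset.
    // When the reader's charset is not UTF-8, multi-byte characters split apart.
    public static String decodeUtf8BytesAs(String text, Charset readerCharset) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, readerCharset);
    }

    public static void main(String[] args) {
        String text = "Olá";

        // Writer and reader agree on UTF-8: the text is unchanged.
        System.out.println(decodeUtf8BytesAs(text, StandardCharsets.UTF_8));

        // Reader assumes ISO-8859-1: 'á' becomes the two characters U+00C3 and U+00A1.
        System.out.println(decodeUtf8BytesAs(text, StandardCharsets.ISO_8859_1));
    }
}
```

If this is what is happening, the bytes leaving the server are fine, and the remaining step would be to make sure the client is told they are UTF-8, for example through the charset in the response's Content-Type header.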

Solution

  • It turns out that the way I was creating the fat JAR was causing this problem. I switched to the shadowJar plugin, and now everything works normally, except that text in a custom font is rendered as normal text. That is fine in my case.

    plugins {
        id 'application'
        id "com.github.johnrengelman.shadow" version "7.1.2" //use this for fatjar
    }
    

    and then

    jar {
        manifest {
            attributes 'Main-Class': 'com.example.myapp.App'
        }
    }
    

    and finally building the jar with this command:

    gradle shadowJar