java linux character-encoding utf-16 smpp

Convert charset in Windows and Linux

I'm building an SMPP gateway that receives a byte[] array of Indian characters and converts it to a readable string, which is then forwarded by email. On a Windows machine, this code works:

byte[] data= ....;
shortMessage = new String(data, GSMCharset.forName("UTF-16"));

On Linux, however, it produces garbage.

I tried other charset options, but none of them gave a readable result. Any ideas how to make it work on Linux?

(The DataCoding == 8)

Solution

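It seems the encoding of the output is controlled by the encoding of the source file. Unless it is specified at compile time (see "How can I specify the encoding of Java source files?"), the default encoding is inherited from the OS. Note that new String(data, "UTF-16") itself decodes the bytes the same way on every platform; what differs between machines is the default charset used when the resulting string is printed or otherwise written out.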

I am guessing the Windows machine you used had a default encoding that produced the output you expect, while the Linux machine did not. A similar issue is reported in the question "Charset of Java source file and failing test".

I was able to reproduce the behavior, and I also found a fix: changing the encoding of the source file. Read on for details.

I ran the following code in two different encodings.

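// Needs: import java.nio.charset.Charset; import java.util.Arrays;
System.out.println(Charset.defaultCharset().toString()); // the JVM's default charset
byte[] data = new byte[] {9, 22, 9, 65, 9, 54, 9, 22, 9, 44, 9, 48, 9, 64};
System.out.println(Arrays.toString(data));
// "UTF-16" without a byte-order mark is decoded as big-endian; this constructor can throw UnsupportedEncodingException
System.out.println(new String(data, "UTF-16"));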

Using default encoding of OS

In my case, it was "MacRoman" on my Mac. The output is this:

MacRoman
[9, 22, 9, 65, 9, 54, 9, 22, 9, 44, 9, 48, 9, 64]
???????
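
The question marks are what a Java charset encoder substitutes for characters it cannot represent. The following is only a minimal sketch of that effect, not the exact path the output above took: it uses ISO-8859-1 as a stand-in for MacRoman (which is not guaranteed to be available in every JRE), and the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class UnmappableDemo {
    // Decode the sample bytes as big-endian UTF-16, then push the text
    // through a charset that has no Devanagari letters.
    static String roundTripThroughLatin1(byte[] utf16Bytes) {
        String text = new String(utf16Bytes, StandardCharsets.UTF_16BE);
        // getBytes(Charset) replaces every unmappable character with the
        // charset's replacement byte, which is '?' for ISO-8859-1.
        byte[] lossy = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(lossy, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] data = {9, 22, 9, 65, 9, 54, 9, 22, 9, 44, 9, 48, 9, 64};
        System.out.println(roundTripThroughLatin1(data)); // prints ???????
    }
}
```

The string was decoded correctly in both cases; the information is lost only when it is encoded again with a charset that cannot hold it.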

Using UTF-8 encoding

I changed the encoding of the source file (see the "Properties" of the source file) and ran the code again. The output is this:

UTF-8
[9, 22, 9, 65, 9, 54, 9, 22, 9, 44, 9, 48, 9, 64]
खुशखबरी
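
If you would rather not depend on the platform default at all, one option is to name the charset explicitly on both the decoding and the output side. This is a sketch under assumptions, not a drop-in fix for your gateway: it assumes the data_coding 8 (UCS2) payload is big-endian, and Ucs2Decode and decodeUcs2 are hypothetical names.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class Ucs2Decode {
    // Assumption: the payload is UCS2 in big-endian byte order,
    // so it can be decoded as UTF-16BE.
    static String decodeUcs2(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        byte[] data = {9, 22, 9, 65, 9, 54, 9, 22, 9, 44, 9, 48, 9, 64};
        String shortMessage = decodeUcs2(data);

        // Encode the output as UTF-8 explicitly instead of relying on
        // the JVM's default charset.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(shortMessage);
    }
}
```

The same idea applies to the email step: whatever writes the string out should be told to use UTF-8 (or another charset that covers Devanagari) rather than picking up the OS default. The terminal or mail client still has to interpret the bytes as UTF-8 for the text to display correctly.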