c++ java-native-interface stdstring

Convert std::string to jstring encoded using Windows-1256


I am using a library (libcurl) to request a webpage with some Arabic content. The response string I get back contains Arabic characters, and the whole response is encoded in Windows-1256.

The problem is that the Arabic characters don't show up properly.

Is there a way to convert a std::string to a jstring encoded in Windows-1256?

By the way, I tried env->NewStringUTF(str.c_str()); and the application crashed.


Solution

  • Java strings use UTF-16. JNI itself has no concept of charset encodings other than (modified) UTF-8 and UTF-16. You could use JNI calls to access Java's Charset class directly, but Java only guarantees a small set of standard charsets, and Windows-1256 is not one of them, so it is available only if the underlying JVM specifically implements it.

    JNI's NewStringUTF() function requires UTF-8 input (and not standard UTF-8 but Java's special modified UTF-8) and returns a jstring, which is UTF-16 internally. Windows-1256 bytes are not valid input for it, which is likely why your call crashed.

    So you would first have to convert the original Arabic data from Windows-1256 to (modified) UTF-8 before calling NewStringUTF(). A better option is to convert the data directly to UTF-16 and then use JNI's NewString() function. Either way, you can use libiconv, ICU4JNI, or any other Unicode library of your choosing for the actual conversion.
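
As a rough illustration of the UTF-16 route, here is a minimal sketch that uses the iconv API (from libiconv, or the one built into glibc). The helper name windows1256ToUtf16 is made up for this example, and it assumes your iconv build recognizes the charset name "WINDOWS-1256"; some builds may only accept an alias such as "CP1256".

```cpp
#include <iconv.h>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of JNI or libcurl): decodes Windows-1256 bytes
// into UTF-16 code units in host byte order using iconv.
std::u16string windows1256ToUtf16(const std::string& input) {
    // Choose the UTF-16 variant matching the host byte order so no BOM is emitted.
    const std::uint16_t probe = 1;
    const bool littleEndian =
        *reinterpret_cast<const unsigned char*>(&probe) == 1;

    // Assumption: this iconv build knows the name "WINDOWS-1256".
    iconv_t cd = iconv_open(littleEndian ? "UTF-16LE" : "UTF-16BE",
                            "WINDOWS-1256");
    if (cd == reinterpret_cast<iconv_t>(-1)) {
        throw std::runtime_error("iconv_open failed: charset not supported");
    }

    // Every Windows-1256 byte maps to one BMP character, so one UTF-16
    // code unit per input byte is always enough room.
    std::u16string output(input.size(), u'\0');

    char* in = const_cast<char*>(input.data());
    std::size_t inLeft = input.size();
    char* out = reinterpret_cast<char*>(&output[0]);
    std::size_t outLeft = output.size() * sizeof(char16_t);

    const std::size_t rc = iconv(cd, &in, &inLeft, &out, &outLeft);
    iconv_close(cd);
    if (rc == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("iconv failed while converting input");
    }

    // Drop any unused tail of the output buffer.
    output.resize(output.size() - outLeft / sizeof(char16_t));
    return output;
}
```

With the JNI headers available, the result could then be handed to NewString(), for example env->NewString(reinterpret_cast<const jchar*>(utf16.data()), static_cast<jsize>(utf16.size())), where utf16 is the value returned by the helper. This relies on jchar and char16_t both being 16-bit unsigned types.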