In Java, I am reading an array of bytes from a file encoded in Shift-JIS format, but the "style" of the characters in the acquired string looks different than normal strings (wider?).
Here is an example of what I mean for the "P" letter:
Ｐ - P
As you can see the first one in Shift-JIS looks different than the second one. Is there a way to use "normal" characters even for Shift-JIS strings?
I am using this piece of code to perform the conversion:
String jis = new String(byteArray, Charset.forName("Shift_JIS"));
Strictly speaking, these are different characters. The first, Ｐ, is the Fullwidth Latin Capital Letter P in Unicode (U+FF30), which comes from the Japanese JIS X 0208 charset. The second, P, is the Latin Capital Letter P from ASCII (U+0050).
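To see the difference for yourself, you can print the code point and Unicode name of each character. This is only a small sketch: the class name `CodePointCheck` and the method `describe` are made up for illustration, and the fullwidth letter is written as the escape `\uFF30` so the example does not depend on the source file's encoding.

```java
final class CodePointCheck {
    // Returns the code point and Unicode name of the first character of s.
    static String describe(String s) {
        int cp = s.codePointAt(0);
        return String.format("U+%04X %s", cp, Character.getName(cp));
    }

    public static void main(String[] args) {
        // Fullwidth P, as produced when decoding the Shift-JIS bytes
        System.out.println(describe("\uFF30"));
        // Plain ASCII P
        System.out.println(describe("P"));
    }
}
```

The two lines printed have different code points and different names, which is why the glyphs are rendered at different widths.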
So, you have to convert fullwidth characters to halfwidth characters. You can do this with ICU4J's Transliterator.
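import com.ibm.icu.text.Transliterator;
// "Fullwidth-Halfwidth" maps fullwidth characters to their halfwidth equivalents
Transliterator transliterator = Transliterator.getInstance("Fullwidth-Halfwidth");
String result = transliterator.transliterate("Ｐ - P");
System.out.println(result); // You will get "P - P"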
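If adding ICU4J is not an option and you only care about the ASCII range, a hand-rolled mapping may be enough. The sketch below relies on the fact that the fullwidth forms U+FF01 to U+FF5E sit at a fixed offset of 0xFEE0 from ASCII U+0021 to U+007E, and it also maps the ideographic space U+3000 to a normal space. The class name `WidthFolding` and the method `toHalfwidthAscii` are hypothetical, not part of any library, and this does not touch katakana or other characters that the ICU transliterator handles.

```java
final class WidthFolding {
    // Distance between a fullwidth form (U+FF01..U+FF5E) and its ASCII counterpart.
    private static final int OFFSET = 0xFEE0;

    // Replaces fullwidth ASCII variants and the ideographic space with plain ASCII.
    static String toHalfwidthAscii(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= '\uFF01' && c <= '\uFF5E') {
                out.append((char) (c - OFFSET)); // fullwidth form to ASCII
            } else if (c == '\u3000') {
                out.append(' ');                 // ideographic space to ASCII space
            } else {
                out.append(c);                   // everything else is left unchanged
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String mixed = "\uFF30 - P"; // fullwidth P, then an ASCII P
        System.out.println(toHalfwidthAscii(mixed)); // prints "P - P"
    }
}
```

You would call it on the decoded string, for example `WidthFolding.toHalfwidthAscii(jis)`.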