I'm currently developing a program in Java, and I want to display Chinese pinyin, which I fetch from a remote website.
But I have the following problem: Chinese pinyin is displayed this way: ji&#462;
Whereas it should be displayed this way: jiǎ
(I just typed the same sequence, except I stripped the slashes).
I think the answer to this question is really simple but I'm struggling to find it.
The problem is that you have an HTML-encoded Unicode character (a numeric character reference) and what you want is the decoded version of it. A library like commons-lang3 (part of Apache Commons) will take your HTML-encoded string and decode it for Java to display. In commons-lang3 the method is named unescapeHtml4 (unescapeHtml is the older commons-lang 2.x name):
String decoded = StringEscapeUtils.unescapeHtml4("ji&#462;");
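If pulling in a library is not an option, numeric references like the one above can also be decoded by hand. The following is only a minimal sketch: the class name NumericEntityDecoder and its decode method are made up for illustration, and it handles only decimal and hexadecimal numeric references, not named entities such as &amp;amp;.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumericEntityDecoder {
    // Matches decimal (&#462;) and hexadecimal (&#x1ce;) numeric character references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    // Hypothetical helper: replaces each numeric reference with the character it names.
    public static String decode(String input) {
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous match and this one unchanged.
            out.append(input, last, m.start());
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            if (Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                // Leave out-of-range references as they were.
                out.append(m.group());
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // The result is "ji" followed by U+01CE; how it looks depends on the console encoding.
        System.out.println(decode("ji&#462;"));
    }
}
```

For anything beyond numeric references, the library call is the safer choice, since it also covers the full table of named entities.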
You can also write the character directly in Java source using a Unicode escape, like this:
String jia = "ji\u01ce";
This one-liner takes a single char and prints its escaped form, zero-padded to four hex digits:
System.out.println( "\\u" + Integer.toHexString('ǎ' | 0x10000).substring(1) );
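The same idea can be applied to a whole string rather than a single character. This is just a sketch, with UnicodeEscaper and escape as hypothetical names; it leaves ASCII alone and escapes every other UTF-16 code unit, so characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair).

```java
public class UnicodeEscaper {
    // Hypothetical helper: escapes every non-ASCII UTF-16 code unit in the input.
    public static String escape(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < 0x80) {
                // Plain ASCII is copied through unchanged.
                out.append(c);
            } else {
                // Four lowercase hex digits, zero-padded, after a backslash and 'u'.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input literal is "ji" followed by U+01CE.
        System.out.println(escape("ji\u01ce"));
    }
}
```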