A Java char is 2 bytes, so it can hold at most 65,536 distinct values (0 to 65,535), but there are 95,221 Unicode characters. Does this mean that you can't handle certain Unicode characters in a Java application?
Does this boil down to what character encoding you are using?
You can handle them all if you're careful enough.
Java's char is a UTF-16 code unit. A character whose code point is above 0xFFFF is encoded as 2 chars (a surrogate pair).
See https://www.oracle.com/technical-resources/articles/javase/supplementary.html for how to handle those characters in Java.
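As a rough illustration of the surrogate-pair point, here is a minimal sketch that uses only standard java.lang methods (String.codePointCount, String.codePointAt, Character.toChars, Character.charCount). The class name SupplementaryDemo and the helpers countCodePoints and fromCodePoint are made up for this example, and U+1F600 is just one arbitrary code point above 0xFFFF.

```java
public class SupplementaryDemo {
    // Hypothetical helper: counts Unicode code points, not UTF-16 code units.
    public static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Hypothetical helper: builds a String from one code point.
    // Code points above 0xFFFF produce two chars (a surrogate pair).
    public static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        // 'A' (U+0041) followed by U+1F600, which lies outside the 16-bit range.
        String s = "A" + fromCodePoint(0x1F600);

        System.out.println(s.length());          // 3 chars (UTF-16 code units)
        System.out.println(countCodePoints(s));  // 2 code points
        System.out.println(Character.isHighSurrogate(s.charAt(1))); // true

        // Walk the string one code point at a time instead of one char at a time.
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            System.out.printf("U+%04X%n", cp);   // U+0041, then U+1F600
            i += Character.charCount(cp);        // advance by 1 or 2 chars
        }
    }
}
```

The part that needs care is indexing: length() and charAt() work on chars, so code that must treat a supplementary character as one unit should step through the string with the code-point methods, as the loop does.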
(BTW, in Unicode 5.2 there are 107,154 assigned characters out of 1,114,112 possible code points, i.e. 17 planes of 65,536 each.)