unicode utf-8 ucs2

What's the longest UTF-8 encoded character in bytes that is also present in UCS-2?


I'd like to know which Unicode character that can be represented in both UCS-2 and UTF-8 has the longest UTF-8 encoding, measured in bytes.


Solution

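    • UCS-2 can encode only code points in the range U+0000 to U+FFFF (the Basic Multilingual Plane).
    • UTF-8 needs at most 3 bytes to encode values in this range: 1 byte for U+0000 to U+007F, 2 bytes for U+0080 to U+07FF, and 3 bytes for U+0800 to U+FFFF.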

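    So there is no single longest character: every UCS-2-encodable code point from U+0800 to U+FFFF takes 3 bytes in UTF-8, which is the maximum for this range. Strictly speaking, the surrogate code points U+D800 to U+DFFF fall inside that range but are not characters and cannot appear in well-formed UTF-8.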
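
The ranges above can be checked by brute force. The following is a minimal sketch in Python, not part of the original answer; the names `utf8_len`, `is_surrogate`, `bmp`, `longest`, and `three_byte` are made up for illustration. It assumes Python 3's built-in UTF-8 encoder, which refuses to encode surrogates, so they are skipped explicitly.

```python
def utf8_len(cp: int) -> int:
    """Return how many bytes the UTF-8 encoding of code point cp takes."""
    return len(chr(cp).encode("utf-8"))


def is_surrogate(cp: int) -> bool:
    """Surrogates (U+D800 to U+DFFF) are not valid in well-formed UTF-8."""
    return 0xD800 <= cp <= 0xDFFF


# Every code point UCS-2 can address, minus the surrogate block.
bmp = [cp for cp in range(0x10000) if not is_surrogate(cp)]

# Longest UTF-8 encoding found in that set, and the code points that reach it.
longest = max(utf8_len(cp) for cp in bmp)
three_byte = [cp for cp in bmp if utf8_len(cp) == longest]

print(longest)                                            # 3
print(f"U+{three_byte[0]:04X} to U+{three_byte[-1]:04X}")  # U+0800 to U+FFFF
```

The scan covers only 65,536 values, so an exhaustive check is cheap and avoids relying on the encoding table from memory.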