
Are BSTRs UTF-16 Encoded?


I'm in the process of trying to learn Unicode. For me, the most difficult part is the encoding. Can BSTRs (Basic Strings) contain code points U+10000 or higher? If not, what is the encoding for BSTRs?


Solution

  • In Microsoft-speak, "Unicode" is generally synonymous with UTF-16 (little-endian, if memory serves). In the case of BSTR, the answer seems to be that it depends. A BSTR:

    • On Microsoft Windows, consists of a string of Unicode characters (wide or double-byte characters).
    • On Apple Power Macintosh, consists of a single-byte string.
    • May contain multiple embedded null characters.

    So, on Windows, yes: a BSTR can contain characters outside the Basic Multilingual Plane (U+10000 and above), but each such character takes two 'wide' chars to store, as a UTF-16 surrogate pair.
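
To make the "two wide chars" point concrete, here is a minimal sketch in Python. It does not touch the BSTR API at all; it only shows how UTF-16 splits a code point above U+FFFF into two 16-bit units. The helper names `utf16_code_units` and `surrogate_pair` are made up for this illustration.

```python
def utf16_code_units(text):
    """Return the 16-bit UTF-16 code units of text as a list of ints."""
    raw = text.encode("utf-16-le")  # little-endian, no byte order mark
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def surrogate_pair(code_point):
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000              # 20-bit value
    high = 0xD800 + (offset >> 10)             # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)            # bottom 10 bits
    return high, low


# U+0041 is inside the BMP: one 16-bit unit.
print([hex(u) for u in utf16_code_units("A")])            # ['0x41']

# U+1D11E (MUSICAL SYMBOL G CLEF) is outside the BMP: two 16-bit units.
print([hex(u) for u in utf16_code_units("\U0001D11E")])   # ['0xd834', '0xdd1e']
print([hex(u) for u in surrogate_pair(0x1D11E)])          # ['0xd834', '0xdd1e']
```

One practical consequence: any length counted in 16-bit units reports 2 for that single character, so code that walks such a string one wide char at a time has to treat a high surrogate and the low surrogate after it as one unit.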