c# .net astral-plane

How are 4-byte characters represented in C#?


How are 4-byte characters represented in C#? As one char, or as a set of 2 chars?

var someCharacter = 'x'; // replace 'x' with a 4-byte UTF-16 character

Solution

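  • A C# char is a single 16-bit UTF-16 code unit, so it can only hold characters from the Basic Multilingual Plane (U+0000 to U+FFFF). A character outside this plane must be stored as two chars, a high surrogate followed by a low surrogate, which together form a surrogate pair.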

    You can write such a character in a string literal with the eight-digit \U escape, for example:

    string s = "\U0001D11E";
    

    See UTF-16.
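
To make the surrogate pair visible, here is a minimal sketch that inspects the string from the answer. The class name SurrogateDemo and the constant Clef are only illustrative; the calls used (char.IsHighSurrogate, char.IsLowSurrogate, char.ConvertToUtf32, char.ConvertFromUtf32) are standard .NET methods.

```csharp
using System;

public static class SurrogateDemo
{
    // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual Plane.
    public const string Clef = "\U0001D11E";

    public static void Main()
    {
        // One code point, but two UTF-16 code units (two chars).
        Console.WriteLine(Clef.Length);                             // 2

        // The first char is the high surrogate, the second the low surrogate.
        Console.WriteLine(char.IsHighSurrogate(Clef[0]));           // True
        Console.WriteLine(char.IsLowSurrogate(Clef[1]));            // True
        Console.WriteLine($"{(int)Clef[0]:X4} {(int)Clef[1]:X4}");  // D834 DD1E

        // Combine the pair back into the full code point.
        Console.WriteLine($"{char.ConvertToUtf32(Clef, 0):X}");     // 1D11E

        // Going the other way yields a string, not a char,
        // because the result needs two chars.
        Console.WriteLine(char.ConvertFromUtf32(0x1D11E) == Clef);  // True
    }
}
```

So the line from the question cannot compile with such a character inside single quotes: a char literal holds exactly one code unit, and the character has to live in a string (or a pair of chars) instead.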