I am aware of the post Converting managed System::String to std::string in C++/CLI, which covers the required conversion. But I came across the following code, which uses marshal_context instead, and I am trying to understand how it works.
// required header : #include <msclr/marshal.h>
System::String^ str = gcnew System::String(L"\u0105");
msclr::interop::marshal_context ctx;
auto constChars = ctx.marshal_as<const char*>(str);
std::string myString(constChars);
If I am not wrong, str is a single "character" represented by 16 bits in UTF-16, which according to the Unicode charts is a small Latin letter a with ogonek (U+0105, ą). But myString comes out to be the single character ?. How does this conversion happen?
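For reference, here is how the raw bytes of myString can be inspected (a minimal sketch; the exact output depends on the system's ANSI code page):

// requires #include <cstdio>
for (unsigned char byte : myString)
    std::printf("%02X ", byte);
// on a code page without ą, e.g. Windows-1252, this prints "3F", i.e. '?'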
Moreover, why does the code work as "expected" when creating str with an ASCII character, say a? In UTF-16, a would be represented in 16 bits, with the most/least (depending on endianness) significant 8 bits being all 0. Why, then, does myString end up holding just the one char a?
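A quick check of that case (a sketch; the names ascii and ctx2 are just illustrative, using the same header as above):

System::String^ ascii = gcnew System::String(L"a");
msclr::interop::marshal_context ctx2;
std::string narrow(ctx2.marshal_as<const char*>(ascii));
// narrow.size() == 1 and narrow[0] == 'a' (0x61): the Windows ANSI code pages
// are ASCII-compatible, so the code unit 0x0061 becomes the single byte 0x61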
A std::string is a sequence of chars. A char is typically only 8 bits, so it can hold ASCII characters but not a value like 0x0105, which does not fit in 8 bits. When the string is marshaled to a narrow encoding, any character that cannot be represented in the target code page is replaced with a placeholder, which is why you get the "garbaged" ? value. You need std::wstring, which contains a sequence of wchar_t, to represent a Unicode string.
Therefore change your last 2 lines to:
//-------------------------------------vvvvvvv--------
auto constChars = ctx.marshal_as<const wchar_t*>(str);
//---vvvvvvv----------------------
std::wstring myString(constChars);
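With that change the code unit is preserved; a quick sanity check (a sketch, requires #include <cassert>; actually printing ą needs a console set up for wide output, which is a separate topic):

// U+0105 is a single UTF-16 code unit, so it survives as one wchar_t
assert(myString.size() == 1 && myString[0] == L'\u0105');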