On Windows, wchar_t is a UTF-16 (LE) code unit, which is, for the most part, equivalent to char16_t. However, the two are still distinct types in the C++ type system, which makes me uncertain whether converting between sequences of these two character types is legal per the C++ standard.
My question is this: In C++17, is it legal to perform the following casts, and to read from the converted pointers?
reinterpret_cast<const wchar_t*>(char16_ptr), where char16_ptr is a const char16_t*, and
reinterpret_cast<const char16_t*>(wchar_ptr), where wchar_ptr is a const wchar_t*
For the purposes of this question, assume the following:
sizeof(wchar_t) == sizeof(char16_t), and
wchar_t is formatted the same as char16_t (as is the case on Windows)
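The two assumptions above, and the fact that the types are distinct, can be partly expressed at compile time. This is a minimal sketch: the name kSameSize is hypothetical, and only the size assumption is checkable this way. That both types hold UTF-16 code units is a platform property the compiler cannot verify.

```cpp
#include <type_traits>

// Holds on every conforming implementation: wchar_t and char16_t are
// distinct types, regardless of how they are represented.
static_assert(!std::is_same<wchar_t, char16_t>::value,
              "wchar_t and char16_t are distinct types");

// Hypothetical name, not from the question. True only on platforms that
// satisfy the size assumption (Windows, for example); typically false
// where wchar_t is 32 bits wide.
constexpr bool kSameSize = sizeof(wchar_t) == sizeof(char16_t);
```

Code that relies on the assumption could guard itself with static_assert(kSameSize, "...") so that it fails to compile elsewhere.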
Basically, is this a strict-aliasing violation?
My understanding is that the cast itself is valid thanks to [expr.reinterpret.cast]/7, but that the result of the cast cannot safely be used, since the object is being accessed through a type that isn't char, unsigned char, or std::byte. Is this interpretation correct?
Note: Other questions have been asked regarding wchar_t and char16_t being the same, but this question is not a duplicate of those as far as I can tell. Notably, the question "Are wchar_t and char16_t the same on Windows?" actually performs a reinterpret_cast between pointers, but none of the answers address whether this cast was ever legal in the first place.
You already know the answer to this: strictly speaking, no.
wchar_t is not char16_t. Neither derives from the other. Neither is similar to the other. Neither is a signed/unsigned version of the other. Neither is an aggregate containing the other. And neither of them is a bytewise type (char, unsigned char, or std::byte).
So you cannot access a wchar_t through a pointer/reference to a char16_t.
If strictly avoiding aliasing violations is your goal, you're going to have to copy the data into a different object. That is valid, assuming both types have the same representation.
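One way to do such a copy is sketched below. The helper names to_wide and to_u16 are hypothetical and not part of the question. The sketch converts element by element rather than with std::memcpy, so it compiles even where the sizes differ, but the result is only meaningful UTF-16 under the question's assumption that wchar_t holds UTF-16 code units.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: copies char16_t code units into a new std::wstring,
// converting each element by value. No char16_t object is ever read
// through a wchar_t pointer, so no aliasing rule is involved.
inline std::wstring to_wide(const std::u16string& in) {
    std::wstring out(in.size(), L'\0');
    std::transform(in.begin(), in.end(), out.begin(),
                   [](char16_t c) { return static_cast<wchar_t>(c); });
    return out;
}

// Hypothetical helper for the opposite direction. Where wchar_t is wider
// than 16 bits, the cast truncates, so this is only a faithful copy when
// every element already fits in a char16_t.
inline std::u16string to_u16(const std::wstring& in) {
    std::u16string out(in.size(), u'\0');
    std::transform(in.begin(), in.end(), out.begin(),
                   [](wchar_t c) { return static_cast<char16_t>(c); });
    return out;
}
```

The cost is an extra allocation and a pass over the data, which is the price of staying within the aliasing rules.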