
Why does ICU use this aliasing barrier when doing a reinterpret_cast?


I'm porting code from ICU 58.2 to ICU 59.1, where they changed the character type from uint16_t to char16_t. I was going to just do a straight reinterpret_cast where I needed to convert the types, but found that ICU 59.1 actually provides functions for this conversion. What I don't understand is why they need to use this anti-aliasing barrier before doing a reinterpret_cast.

#elif (defined(__clang__) || defined(__GNUC__)) && U_PLATFORM != U_PF_BROWSER_NATIVE_CLIENT
#   define U_ALIASING_BARRIER(ptr) asm volatile("" : : "rm"(ptr) : "memory")
#endif

...

    inline const UChar *toUCharPtr(const char16_t *p) {
#ifdef U_ALIASING_BARRIER
        U_ALIASING_BARRIER(p);
#endif
        return reinterpret_cast<const UChar *>(p);
    }
Why wouldn't it be safe just to use reinterpret_cast without calling U_ALIASING_BARRIER?


Solution

  • At a guess, it's to stop any violations of the strict aliasing rule, that might occur in calling code that hasn't been completely cleaned up, from resulting in unexpected behaviour when optimizing (the hint to this is in the comment above: "Barrier for pointer anti-aliasing optimizations even across function boundaries.").

    The strict aliasing rule forbids dereferencing pointers that alias the same object when they have incompatible types (a C notion; C++ expresses the same idea in more words). Here's a small gotcha: char16_t and uint16_t aren't required to be compatible. uint16_t is actually an optionally-supported type (in both C and C++); char16_t has the same width and signedness as uint_least16_t, which isn't necessarily the same type as uint16_t. They will have the same width on x86, but a compiler isn't required to treat them as the same type, and it might even deliberately assume that types which typically indicate different intent don't alias.

    There's a more complete explanation in the linked answer, but basically given code like this:

    uint16_t buffer[8] = {};  // some buffer of UTF-16 code units
    
    buffer[0] = u'a';
    uint16_t *pc1 = buffer;
    
    char16_t *pc2 = (char16_t *)pc1;
    pc2[0] = u'b';
    
    uint16_t c3 = pc1[0];
    

    ...if for whatever reason the compiler doesn't have char16_t and uint16_t tagged as compatible, and you're compiling with optimizations that include its equivalent of -fstrict-aliasing, it's allowed to assume that the write through pc2 couldn't have modified whatever pc1 points at, and to skip reloading the value before assigning it to c3, possibly giving c3 the value u'a' instead of u'b'.

    Code a bit like the example could plausibly arise mid-way through a conversion process where the previous code was happily using uint16_t * everywhere, but now a char16_t * is made available at the top of a block for compatibility with ICU 59, before all the code below has been completely changed to read only through the correctly-typed pointer.

    Since compilers don't generally optimize across hand-coded assembly, the presence of an asm block forces the compiler to discard its assumptions about the state of registers and other cached values, and to do a full reload of every value the first time it's dereferenced after U_ALIASING_BARRIER, regardless of optimization flags. This won't protect you from further aliasing problems if you continue to write through the uint16_t * below the conversion (if you do that, it's legitimately your own fault), but it should at least ensure that state from before the conversion call doesn't persist in a way that could cause writes through the new pointer to be accidentally skipped afterwards.