Is viewing an integer as an array of smaller integers UB?
For example, is there UB in this code:
#include <iostream>
#include <cstdint>
#include <algorithm> // sort
void sort_bytes(std::uint32_t& x) {
    std::uint8_t* p = (std::uint8_t*)&x;
    std::sort(p, p + 4);
}

void sort_words(std::uint32_t& x) {
    std::uint16_t* p = (std::uint16_t*)&x;
    std::sort(p, p + 2);
}

int main() {
    const std::uint32_t x = 1234342542u;
    std::uint32_t y = x, z = x;
    std::cout << x << std::endl;
    sort_bytes(y);
    std::cout << y << std::endl;
    sort_words(z);
    std::cout << z << std::endl;
}
Yes, that's exactly what the strict aliasing rule forbids. However, compilers generally have options to disable reliance on this rule, e.g. -fno-strict-aliasing for GCC and Clang, with which the behavior becomes defined in a practical sense on those compilers. Without such an option, these compilers may compile the code in unintended ways because the aliasing rule is broken, which is what the undefined behavior means in practice.
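If you want defined behavior without relying on compiler flags, one common option is to copy the bytes into a real array of the smaller type, operate on that, and copy the result back; std::memcpy between trivially copyable objects is well-defined. A minimal sketch of sort_words rewritten that way (the function name is just taken from the question; the result is still representation-dependent, as discussed below):

#include <cstdint>
#include <cstring>   // memcpy
#include <algorithm> // sort

void sort_words(std::uint32_t& x) {
    std::uint16_t halves[2];
    std::memcpy(halves, &x, sizeof x); // copy the object representation out
    std::sort(halves, halves + 2);     // sort the two 16-bit halves
    std::memcpy(&x, halves, sizeof x); // copy the result back
}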
sort_bytes is probably an exception. std::uint8_t is typically an alias for unsigned char, and unsigned char specifically has an exception to the aliasing rule: it may alias any other type (but not the other way around).
Therefore sort_bytes is fine from that perspective. However, exactly what values will be read is implementation-defined and depends on the representations of the involved types. In practice the most relevant factor is whether the system is little-endian or big-endian (or something else entirely).
However, even with the aliasing exception, the standard as currently worded technically doesn't permit treating the result of the cast as a pointer into an array, so the pointer arithmetic in p + 4 is not strictly sanctioned. That is probably a defect in the wording that will be resolved eventually.
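For completeness: if C++20 is available, std::bit_cast sidesteps both the aliasing question and the pointer-into-array question, because it copies the whole object representation instead of reinterpreting pointers. A sketch of sort_words written that way (the result is still endianness-dependent):

#include <cstdint>
#include <bit>       // bit_cast (C++20)
#include <array>
#include <algorithm>

void sort_words(std::uint32_t& x) {
    // Reinterpret the object representation as two 16-bit halves.
    // bit_cast is ill-formed if the sizes ever differed, so there is no silent UB.
    auto halves = std::bit_cast<std::array<std::uint16_t, 2>>(x);
    std::sort(halves.begin(), halves.end());
    x = std::bit_cast<std::uint32_t>(halves);
}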