c++ undefined-behavior

Is viewing an integer as an array of smaller integers UB?



For example, is there UB in this code:

#include <iostream>
#include <cstdint>
#include <algorithm> // sort

void sort_bytes(std::uint32_t& x) {
    std::uint8_t* p = (std::uint8_t*)&x;

    std::sort(p, p+4);
}

void sort_words(std::uint32_t& x) {
    std::uint16_t* p = (std::uint16_t*)&x;

    std::sort(p, p+2);
}


int main() {
    const std::uint32_t x = 1234342542u;
    std::uint32_t y = x, z = x;
    std::cout << x << std::endl;
    sort_bytes(y);
    std::cout << y << std::endl;
    sort_words(z);
    std::cout << z << std::endl;
}

Solution

  • Yes, for sort_words: that's exactly what the strict aliasing rule forbids. An object of type std::uint32_t is accessed through an lvalue of type std::uint16_t, which is undefined behavior. However, compilers generally have options to disable reliance on this rule, e.g. -fno-strict-aliasing for GCC and Clang, with which the behavior is defined in the practical sense on these compilers. Without such an option, these compilers may assume the rule holds and compile the code in unintended ways.

    sort_bytes is probably an exception. std::uint8_t is typically an alias for unsigned char, and unsigned char is specifically exempt from the aliasing rule: it may be used to access an object of any other type (but not the other way around).

    Therefore sort_bytes is fine from that perspective. However, exactly which values are read is implementation-defined and depends on the representations of the types involved. In practice, it matters most whether the system is little-endian or big-endian (or something else entirely).

    However, even with the aliasing exception, the standard currently does not technically permit treating the result of the cast as a pointer into an array, which the pointer arithmetic in p+4 requires. That is probably a defect that will be resolved eventually.
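
One way to sidestep both the aliasing question and the pointer-arithmetic question is to copy the object representation into a real array, sort that array, and copy it back. The following is only a minimal sketch of that idea, not part of the original answer: the names sort_bytes_copy and sort_words_copy are hypothetical, and the functions return the result by value instead of modifying a reference. The resulting values still depend on endianness, exactly as described above.

```cpp
#include <algorithm>  // std::sort
#include <cstdint>    // std::uint16_t, std::uint32_t
#include <cstring>    // std::memcpy
#include <iterator>   // std::begin, std::end

// Hypothetical alternative to sort_bytes: sorts the bytes of x's object
// representation in a separate unsigned char array.
std::uint32_t sort_bytes_copy(std::uint32_t x) {
    unsigned char bytes[sizeof(x)];
    std::memcpy(bytes, &x, sizeof(x));   // copy the representation out
    std::sort(std::begin(bytes), std::end(bytes));
    std::memcpy(&x, bytes, sizeof(x));   // copy the sorted bytes back
    return x;
}

// Hypothetical alternative to sort_words: same approach, but with a real
// array of two std::uint16_t objects, so no std::uint32_t object is ever
// accessed through a std::uint16_t lvalue.
std::uint32_t sort_words_copy(std::uint32_t x) {
    std::uint16_t words[2];
    static_assert(sizeof(words) == sizeof(x), "size mismatch");
    std::memcpy(words, &x, sizeof(x));
    std::sort(std::begin(words), std::end(words));
    std::memcpy(&x, words, sizeof(x));
    return x;
}
```

In main, these would be called as y = sort_bytes_copy(x) and z = sort_words_copy(x). Compilers commonly optimize such fixed-size memcpy calls away, so the copy is unlikely to cost much in practice, though that is an expectation about typical optimizers and not a guarantee. With C++20, std::bit_cast from <bit> may be another option for the same kind of conversion.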