Tags: c++, language-lawyer, strict-aliasing

Do we have a Strict Aliasing rule violation?


Do we have a Strict Aliasing rule violation in this code? I thought that int -> char and int -> std::byte are OK, but what about int8_t?

#include <cstdint>
#include <iostream>
using namespace std;

int main() {
    int arr[8] = {1, 1, 1, 1, 1, 1, 1, 1}; // Little endian: 10000000 | 10000000 ....
    int8_t *ptr = (int8_t*)arr; // Do we have a violation of the Strict Aliasing rule in this cast?
    ptr += 3; // 100[0]0000 ....

    cout << (int)*ptr << endl; // outputs 0
    return 1;
}

Solution

  • int8_t *ptr = (int8_t*)arr;
    

    This is perfectly fine and equivalent to a reinterpret_cast. The cast in itself is never a strict aliasing violation; the access of the pointed-to object is. See What is the strict aliasing rule? for more details.

    cout << (int)*ptr << endl; // outputs 0
    

    This is undefined behavior ([basic.lval] p11) because you are accessing an int through a glvalue of type int8_t. The only exceptions are for char, unsigned char, and std::byte; int8_t is most likely an alias for signed char, which is not among them.

    ptr += 3;
    

    This is undefined behavior:

    For addition or subtraction, if the expressions P or Q have type “pointer to cv T”, where T and the array element type are not similar, the behavior is undefined.
    [Example 1:

    int arr[5] = {1, 2, 3, 4, 5};
    unsigned int *p = reinterpret_cast<unsigned int*>(arr + 1);
    unsigned int k = *p;            // OK, value of k is 2 ([conv.lval])
    unsigned int *q = p + 1;        // undefined behavior: p points to an int, not an unsigned int object
    

    — end example]

    - [expr.add] p6

    What you're doing is effectively the same, just with int8_t instead of unsigned int.

    I thought that int -> char and int -> std::byte are OK, but what about int8_t?

    The cast itself is always fine, but only std::byte or char have special properties that would also make ptr += 3 valid(1).

    int8_t is most likely an alias for signed char, which has significantly fewer magical powers than unsigned char and std::byte. See the appendix on this answer for a summary of magical powers.
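Whether int8_t really is signed char depends on the implementation, so it can be worth checking rather than assuming. The following is a minimal sketch, assuming C++17 and a platform that provides std::int8_t; the names int8_is_signed_char and int8_is_char are made up for this illustration.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical names, used only for illustration.
// True when the implementation defines std::int8_t as an alias for signed char.
constexpr bool int8_is_signed_char = std::is_same_v<std::int8_t, signed char>;

// True when std::int8_t is an alias for plain char, one of the types that
// [basic.lval] p11 allows to access other objects.
constexpr bool int8_is_char = std::is_same_v<std::int8_t, char>;
```

If int8_is_signed_char is true, the access through int8_t in the question gets no special treatment from the aliasing rule.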


    (1) The standard wording is defective and does not support that; see Is adding to a "char *" pointer UB, when it doesn't actually point to a char array?.
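One way to look at an individual byte of the array without touching either rule is to copy the object representation into an unsigned char buffer with std::memcpy and index that buffer. This is only a sketch of that approach, not something taken from the question or the answer above; byte_at is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the byte at position `index` in the object
// representation of `arr`. The ints are never accessed through another type,
// and the pointer arithmetic happens on a real unsigned char array.
template <std::size_t N>
unsigned char byte_at(const int (&arr)[N], std::size_t index) {
    assert(index < sizeof arr);           // stay inside the copied bytes
    unsigned char bytes[sizeof arr];
    std::memcpy(bytes, arr, sizeof arr);  // copy the raw bytes of the whole array
    return bytes[index];
}
```

For the array in the question, byte_at(arr, 3) should give 0 on a little-endian platform with a 4-byte int; on other platforms the value can differ, since it depends on how int is represented.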