Tags: c, pointers, struct, memory-alignment, pointer-arithmetic

Can I move between contiguous sequences of fields of the same type in a struct using pointer arithmetic without alignof?


I am aware that there are other similar questions. I have read through these and don’t think this question has been answered.

Can I move between consecutive fields of the same type in a struct (that is, fields that are contiguous in declaration order) using pointer arithmetic without consulting _Alignof?

I can move between array elements without consulting _Alignof because trailing padding is included.

However, is it possible that consecutive fields of the same type in a struct are not aligned the same way as in an array?

In other words, is it possible for this code to be undefined behavior?

struct MyStruct {
    long int field_1;
    int field_2;
    int field_3;
};

int main(void) {
    struct MyStruct my_struct;
    int *field_2_ptr = &my_struct.field_2;
    int field_3_value = *(field_2_ptr + 1);
}

I know that this is bad practice. I am aware of ->. I know padding can interfere in general. I want to know if padding can interfere in this specific circumstance.

This question is about C. I don’t care about C++.

So far I have compiled this with GCC and run it under Valgrind, and compiled it with clang and UBSan, to see if anything is off. It seems to be fine on this system (x86-64 Linux).


Solution

  • Strictly speaking, this is undefined behavior.

    Pointer arithmetic is allowed on array objects (and a single object is treated as a 1-element array for this purpose), provided the original pointer and the resulting pointer both point into the same array object or one element past its end. Additionally, a one-past-the-end pointer may not be dereferenced; doing so triggers undefined behavior.

    This is spelled out in section 6.5.6p7-8 of the C standard, regarding additive operators:

    7 For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

    8 When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

    The last sentence of paragraph 8 is exactly what's happening here:

    int field_3_value = *(field_2_ptr + 1);
    

    Since this dereferences a pointer to one past the end of what is effectively a 1-element array, it triggers undefined behavior. It doesn't matter that &my_struct.field_2 + 1 == &my_struct.field_3 would evaluate to true.
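For contrast, here is a minimal sketch of one way to reach field_3 from a computed position without stepping outside any array: go through the bytes of the enclosing struct. The helpers read_int_at and fields_are_adjacent are hypothetical names invented for this illustration, not part of the question. The sketch relies on the fact that a pointer to unsigned char may address every byte of the object it was derived from, and it uses offsetof to find the field rather than assuming there is no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct MyStruct {
    long int field_1;
    int field_2;
    int field_3;
};

/* Hypothetical helper (not from the question): copy out the int stored
   at byte `offset` inside *s. */
static int read_int_at(const struct MyStruct *s, size_t offset) {
    int value;
    assert(offset + sizeof value <= sizeof *s);  /* stay inside the struct */
    /* The unsigned char pointer ranges over the bytes of the whole struct,
       so this arithmetic never leaves the object it was derived from. */
    memcpy(&value, (const unsigned char *)s + offset, sizeof value);
    return value;
}

/* Hypothetical helper: 1 if field_3 starts immediately after field_2,
   0 if the implementation placed padding between them. */
static int fields_are_adjacent(void) {
    return offsetof(struct MyStruct, field_3)
        == offsetof(struct MyStruct, field_2) + sizeof(int);
}
```

Calling read_int_at(&my_struct, offsetof(struct MyStruct, field_3)) yields the value of field_3 whether or not fields_are_adjacent() reports padding. If the fields really need to be indexed, declaring them as an int array member is the simpler design.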

    For the same reason, this is also undefined behavior:

    int x[2][2] = {{1,2},{3,4}};
    int y = x[0][2];
    

    While doing the above would probably work, there's no guarantee of that. Modern compilers optimize aggressively, and they can and do assume no undefined behavior exists and exploit that assumption.
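If the goal is to treat a 2D array as one flat sequence, a sketch like the following keeps every subscript in bounds. The name flat_get and the ROWS and COLS constants are assumptions made up for this example; the idea is simply to turn a flat index k into a row and a column instead of running off the end of row 0.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 2, COLS = 2 };

/* Hypothetical helper: return the k-th element in row-major order,
   using only subscripts that are valid for each dimension. */
static int flat_get(int (*x)[COLS], size_t k) {
    assert(k < (size_t)ROWS * COLS);   /* reject indices past the last row */
    return x[k / COLS][k % COLS];      /* row = k / COLS, column = k % COLS */
}
```

With x declared as above, flat_get(x, 2) reads x[1][0], which is the element that x[0][2] was trying to reach.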

    I've seen cases where, given pointers a and b which point to adjacent memory, (uintptr_t)(a+1) == (uintptr_t)b is true but a+1 == b is false.
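The two comparisons can be written out as follows. This is only a sketch with hypothetical helper names; it assumes the implementation provides uintptr_t, which is an optional type. For two elements of the same array both comparisons agree. For separately declared objects that merely happen to be adjacent, the result of the pointer comparison can depend on the compiler and optimization level, so no output is claimed for that case.

```c
#include <stdint.h>

/* Hypothetical helper: compare the addresses after converting to integers. */
static int adjacent_as_integers(int *a, int *b) {
    return (uintptr_t)(a + 1) == (uintptr_t)b;
}

/* Hypothetical helper: compare the pointers directly. a must point to an
   actual int object so that a + 1 is a valid one-past-the-end pointer. */
static int adjacent_as_pointers(int *a, int *b) {
    return a + 1 == b;
}
```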