Tags: c, strict-aliasing, type-punning

type-punning a char array struct member


Consider the following code:

typedef struct { char byte; } byte_t;
typedef struct { char bytes[10]; } blob_t;

int f(void) {
  blob_t a = {0};
  *(byte_t *)a.bytes = (byte_t){10};
  return a.bytes[0];
}

Does this cause aliasing problems in the return statement? On the one hand, a.bytes accesses a type (char[10]) that does not alias the byte_t assignment; on the other hand, the [0] part accesses a char, which does alias.

I can construct a slightly larger example where gcc -O1 -fstrict-aliasing does make the function return 0, and I'd like to know if this is a gcc bug, and if not, what I can do to avoid this problem (in my real-life example, the assignment happens in a separate function so that both functions look really innocent in isolation).

Here is a longer, more complete example for testing:

#include <stdio.h>

typedef struct { char byte; } byte_t;
typedef struct { char bytes[10]; } blob_t;

static char *find(char *buf) {
    for (int i = 0; i < 1; i++) { if (buf[0] == 0) { return buf; }}
    return 0;
}

void patch(char *b) { 
    *(byte_t *) b = (byte_t) {10}; 
}

int main(void) {
    blob_t a = {0};
    char *b = find(a.bytes);
    if (b) {
        patch(b);
    }
    printf("%d\n", a.bytes[0]);
}

Building this with gcc -O1 -fstrict-aliasing produces a program that prints 0 instead of the expected 10.


Solution

  • The main issue here is that those two structs are not compatible types, so there can be various problems with alignment and padding.

    That issue aside, the C standard (6.5/7) only allows the following (the "strict aliasing rule"):

    An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

    • a type compatible with the effective type of the object,
      ...
    • an aggregate or union type that includes one of the aforementioned types among its members

    Looking at *(byte_t *)a.bytes: a.bytes has the effective type char[10], and each individual element of that array in turn has the effective type char. You dereference it as a byte_t, which is neither a compatible struct type nor a type that has a char[10] among its members. It does have a char member, though.

    The standard is not exactly clear on how to treat an object whose effective type is an array. If you read the above part strictly, then your code does indeed violate strict aliasing, because you access a char[10] through a struct that doesn't have a char[10] member. I'd also be a bit concerned about the compiler padding either struct to meet alignment.
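
The strict reading above concerns the byte_t lvalue used for the store. As a minimal sketch, assuming the goal is only to place the bytes of a byte_t into the buffer, the store can be done with memcpy, which copies the object representation byte by byte and so never accesses the array through a byte_t lvalue. The names patch_copy and f_copy are hypothetical and not from the question.

```c
#include <assert.h>
#include <string.h>

typedef struct { char byte; } byte_t;
typedef struct { char bytes[10]; } blob_t;

/* Guard against the padding concern: the copied struct must fit in the blob. */
static_assert(sizeof(byte_t) <= sizeof(blob_t), "byte_t must fit inside blob_t");

/* Hypothetical replacement for patch(): copies the bytes of a byte_t
   into the buffer instead of storing through a byte_t lvalue. */
void patch_copy(char *b) {
    byte_t value = {10};
    memcpy(b, &value, sizeof value);
}

/* Hypothetical helper mirroring f() from the question. */
int f_copy(void) {
    blob_t a = {0};
    patch_copy(a.bytes);
    return a.bytes[0];
}
```

With this variant, f_copy() returns 10, and the result does not depend on whether -fstrict-aliasing is enabled.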

    Generally, I'd simply advise against doing fishy things like this. If you need type punning, use a union. And if you wish to work with raw binary data, use uint8_t instead of char, which is potentially signed and therefore non-portable.
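
The union advice could look roughly like the sketch below, which also switches the members to uint8_t. All names here (byte8_t, blob8_t, blob_view_t, patch_view, f_union) are hypothetical, and the layout assumes the patched byte is the first byte of the blob, as in the question.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t byte; } byte8_t;
typedef struct { uint8_t bytes[10]; } blob8_t;

/* Hypothetical union: both members start at offset 0 and share storage. */
typedef union {
    blob8_t blob;
    byte8_t first;
} blob_view_t;

static_assert(sizeof(byte8_t) <= sizeof(blob8_t), "byte view must fit inside the blob");

/* Stores through the union member, so the compiler knows the two views overlap. */
void patch_view(blob_view_t *v) {
    v->first = (byte8_t){10};
}

/* Hypothetical helper mirroring f() from the question. */
int f_union(void) {
    blob_view_t v = { .blob = {{0}} };
    patch_view(&v);
    return v.blob.bytes[0];
}
```

Here f_union() returns 10. One caveat: after a store to the smaller member, the C standard leaves the union's remaining bytes (bytes[1] through bytes[9] here) with unspecified values, so this sketch only suits code that reads back the first byte.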