Is there a C struct "packed aligned" option?

Compiler: I'm personally using gcc, but the question is conceptual. I'm interested in options for any compiler.

Is there a way to tell the C compiler to make struct B have the same size as struct AB without sacrificing alignment?

It should also respect alignment when put into an array.

I've tried using __attribute__ ((__packed__, aligned(4))) but this seems to be the same as not using any attributes (the size is still rounded up to the alignment).
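For reference, here is a minimal sketch of that attempt. It assumes GCC or Clang, since `__attribute__` is a GNU extension, and the struct name `A_packed_aligned` and the helper functions are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; same fields as struct A, with both attributes applied. */
struct __attribute__ ((__packed__, aligned(4))) A_packed_aligned {
    uint32_t a0; /* 4 bytes */
    uint16_t a1; /* 2 bytes */
};

/* Size the compiler actually assigns to the type. */
size_t size_of_packed_aligned(void) {
    return sizeof(struct A_packed_aligned);
}

/* Alignment the compiler actually assigns to the type. */
size_t align_of_packed_aligned(void) {
    return _Alignof(struct A_packed_aligned);
}
```

With the compilers I tried, the alignment comes out as 4 and the size is rounded up to a multiple of it, so the trailing padding is still there.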

I don't understand how this isn't an obvious improvement: in certain cases it could save quite a bit of space for structs without sacrificing performance (or ergonomics) on field lookups. All it would require for the compiler is to store a (size, alignment) for each struct.

#include <stdio.h>
#include <stdint.h>

struct A { // total size: 6 bytes (actually 8)
  uint32_t a0; // 4 bytes
  uint16_t a1; // 2 bytes
};

struct B { // total size: 8 bytes (actually 12)
  struct A b0; // 6 bytes
  uint16_t b1; // 2 bytes
};

struct AB { // total size: 8 bytes (actually 8)
  uint32_t a0; // 4 bytes
  uint16_t a1; // 2 bytes
  uint16_t b1; // 2 bytes
};

// Kind of works, but sacrifices alignment
struct __attribute__ ((__packed__)) Ap {
  uint32_t a0; // 4 bytes
  uint8_t a1;  // 1 byte
};
struct __attribute__ ((__packed__)) Bp {
  struct Ap b0;
  uint16_t  b1;
};

int main(void) {
  // sizeof yields a size_t, so the matching format specifier is %zu, not %u
  printf("sizeof(A)  = %zu\n", sizeof(struct A));  // 8  (not 6)
  printf("sizeof(B)  = %zu\n", sizeof(struct B));  // 12 (not 8)
  printf("sizeof(AB) = %zu\n", sizeof(struct AB)); // 8  (same as desired)
  printf("sizeof(Ap) = %zu\n", sizeof(struct Ap)); // 5  (as desired)
  printf("sizeof(Bp) = %zu\n", sizeof(struct Bp)); // 7  (not 8)
  return 0;
}

The way I've been actually doing this:

#define STRUCT_A  \
  uint32_t a0; \
  uint8_t a1

struct AB {
  STRUCT_A;    // 5 bytes of fields, plus 1 byte of padding before b1
  uint16_t b1; // 2 bytes
};

Solution

If I correctly understand what you're wishing for, it's impossible. It's not merely a compiler or ABI restriction; it would actually be inconsistent with the following fundamental principles of the C language.

1. In an array of type T, successive elements are at intervals of sizeof(T) bytes.

This guarantee is what allows you to correctly implement "generic" array processing functions like qsort. If for instance we want a function that copies element 3 of an array to element 4, then the language promises that the following must work:

void copy_3_to_4(void *arr, size_t elem_size) {
    unsigned char *c_arr = arr; // convenience to minimize casting
    for (size_t i = 0; i < elem_size; i++) {
        c_arr[4*elem_size + i] = c_arr[3*elem_size+i];
    }
}

struct foo { ... };
struct foo my_array[100];
copy_3_to_4(my_array, sizeof(struct foo)); // equivalent to my_array[4] = my_array[3]

From this it follows that if a type T has a required alignment of k bytes, then sizeof(T) must necessarily be a multiple of k. Otherwise, the elements of a large enough array could not all be correctly aligned. So your proposed notion of an object of size 6 and alignment 4 cannot be consistent with this principle.

So for the struct A in your example, with a uint32_t and a uint16_t member: if we suppose that, as on most common platforms, uint32_t requires 4-byte alignment, then struct A requires the same, and so sizeof(struct A) can't be 6; it has to be 8. (Or, in principle, 12, 16, etc., but that would be weird.) The 2 bytes of padding are unavoidable.
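To make the first principle concrete, here is a small sketch that checks it for struct A. It assumes a C11 compiler (for `_Static_assert` and `_Alignof`); the helper names `stride_of_A` and `tail_padding_of_A` are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct A {
    uint32_t a0;
    uint16_t a1;
};

/* If this did not hold, the elements of an array of struct A
   could not all be correctly aligned. */
_Static_assert(sizeof(struct A) % _Alignof(struct A) == 0,
               "size must be a multiple of alignment");

/* Byte distance between two adjacent elements of an array of struct A. */
size_t stride_of_A(void) {
    struct A arr[2];
    return (size_t)((unsigned char *)&arr[1] - (unsigned char *)&arr[0]);
}

/* Number of bytes of struct A that lie past the end of its last member. */
size_t tail_padding_of_A(void) {
    return sizeof(struct A) - (offsetof(struct A, a1) + sizeof(uint16_t));
}
```

The stride always equals sizeof(struct A). On a platform where uint32_t needs 4-byte alignment, one would expect the tail padding to be 2 bytes, though the exact figure is up to the implementation.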

2. Distinct objects cannot overlap.

And here "overlap" is defined in terms of sizeof. The sizeof(T) bytes starting at address &foo cannot coincide with any of the corresponding bytes of any other object bar. This includes any padding bytes that either object may contain. And distinct members of a struct (other than bitfields) are distinct objects for this purpose.

For a struct, this means that an operation which modifies a struct is allowed to freely modify its padding bytes, if the compiler finds it convenient to do so. With your struct A and struct B examples, we could imagine:

void copy(struct A *dst, const struct A *src) {
    *dst = *src;
}

The compiler is allowed to compile this into a single 64-bit load/store pair, which copies not only the 6 bytes of actual data but also the 2 bytes of padding. If it couldn't do that, it would have to compile it as a 32-bit copy plus a 16-bit copy, which would be less efficient.

Perhaps an even better example is that you are also allowed to copy a struct A by doing memcpy(&y, &x, sizeof(struct A)), which will more obviously copy 8 bytes, or a byte-by-byte copy of sizeof(struct A) bytes as in copy_3_to_4 above.
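As a rough illustration of the memcpy case, the sketch below counts how many bytes of the destination get overwritten by such a copy. The function name `bytes_overwritten_by_memcpy` is made up for this example, and both objects are filled only with memset so that every byte, padding included, has a known value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct A {
    uint32_t a0;
    uint16_t a1;
};

/* Returns how many bytes of the destination were overwritten by the copy. */
size_t bytes_overwritten_by_memcpy(void) {
    struct A src, dst;
    memset(&src, 0x00, sizeof src); /* every byte of src, padding included, is 0x00 */
    memset(&dst, 0xFF, sizeof dst); /* every byte of dst, padding included, is 0xFF */

    memcpy(&dst, &src, sizeof(struct A));

    /* Inspect dst byte by byte; any byte that is now 0x00 was overwritten. */
    const unsigned char *p = (const unsigned char *)&dst;
    size_t changed = 0;
    for (size_t i = 0; i < sizeof dst; i++) {
        if (p[i] == 0x00) {
            changed++;
        }
    }
    return changed;
}
```

The count comes out as sizeof(struct A), not 6: the copy has no notion of which bytes are "real" data and which are padding.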

And it is legal to do:

struct A foo = { 42 };
struct B bar;
bar.b1 = 17;
copy(&bar.b0, &foo);
assert(bar.b1 == 17); // should be unchanged

If you wanted to have sizeof(struct B) == 8, then the b1 member would have to exist within the padding of the b0 member. So if copy(&bar.b0, &foo) does a 64-bit copy then it would overwrite it. We can't require that copy handle this case specially, because it could be compiled in an entirely separate file, and has no way of knowing whether its argument exists within some larger object. And we also can't tell the programmer they can't do copy(&bar.b0, &foo); the object bar.b0 is a bona fide object of type struct A and is entitled to all the rights and privileges of any object of that type.

So the only way out of this dilemma is for sizeof(struct B) to be larger than 8. And since its required alignment is still 4 (as inherited from struct A, as inherited from uint32_t), then necessarily sizeof(struct B) must be 12 or more.
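The layout consequences can be checked directly. The following is only a sketch, assuming a C11 compiler; `print_layouts` is a hypothetical helper, and the struct definitions are repeated from the question so the snippet stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct A  { uint32_t a0; uint16_t a1; };
struct B  { struct A b0; uint16_t b1; };
struct AB { uint32_t a0; uint16_t a1; uint16_t b1; };

/* b1 is a distinct object from b0, so it must begin at or after
   the end of b0, with b0's padding counted as part of b0. */
_Static_assert(offsetof(struct B, b1) >= sizeof(struct A),
               "b1 may not live inside b0's padding");

/* Consequently struct B needs room for all of b0 plus b1. */
_Static_assert(sizeof(struct B) >= sizeof(struct A) + sizeof(uint16_t),
               "struct B must hold b0 (with padding) and b1");

/* Prints the sizes and the offset of b1 for inspection. */
void print_layouts(void) {
    printf("sizeof(struct A)       = %zu\n", sizeof(struct A));
    printf("sizeof(struct B)       = %zu\n", sizeof(struct B));
    printf("sizeof(struct AB)      = %zu\n", sizeof(struct AB));
    printf("offsetof(struct B, b1) = %zu\n", offsetof(struct B, b1));
}
```

On a typical ABI where uint32_t has 4-byte alignment, I would expect this to report 8, 12, 8 and an offset of 8 for b1, matching the numbers in the question. The flattened struct AB avoids the problem only because a1 and b1 are siblings there, so no struct A object with its own padding exists inside it.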