Why does [[no_unique_address]] have no effect on public data members?

Consider the following example:

template <class T>
struct A {
    [[no_unique_address]] T t;
    int i;
};

struct B {
    long l;
    int i;
};

class C {
    long l;
    int i;
};

Both GCC and Clang report sizeof(A<B>) as 24 and sizeof(A<C>) as 16 (Compiler Explorer).

Class template A<T> has T as one of its data members with the [[no_unique_address]] attribute. The only difference between B and C is that B is a struct and C is a class. I can't understand why A<B> and A<C> are not the same size. In other words, why does the compiler embed member i of class template A<T> into the tail padding of C but not into the tail padding of B?

If I modify B's member access to private, both GCC and Clang think the size of A<B> is 16: (Compiler Explorer)

template <class T>
struct A {
    [[no_unique_address]] T t;
    int i;
};

struct B {
private:      // private now
    long l;
    int i;
};

class C {
    long l;
    int i;
};

So it seems that the difference does not come from struct and class, but from the accessibility of data members.

I know it's implementation-defined whether the compiler embeds other members into the tail padding of members with the [[no_unique_address]] attribute, but I guess there is some special rule that causes the same weird behavior in both GCC and Clang. I checked the standard and ABI documentation but couldn't find a description.

Solution

This follows from how the Itanium C++ ABI, which both GCC and Clang implement on most non-Windows platforms, specifies class layout.

https://itanium-cxx-abi.github.io/cxx-abi/abi.html#pod:

The dsize, nvsize, and nvalign of these types are defined to be their ordinary size and alignment. These properties only matter for non-empty class types that are used as base classes. We ignore tail padding for PODs because the standard before the resolution of CWG issue 43 did not allow us to use it for anything else and because it sometimes permits faster copying of the type.

dsize is the "data size" of an object, which means how many bytes are used for the type, not including tail-padding.

This means that for "POD for the purpose of layout" types, the would-be tail padding is considered part of the data of the type, so it cannot be reused to hold the member i. The compiler sees B as an opaque 16 bytes, even though the last 4 bytes are padding.

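When you make any member private, B is no longer "POD for the purpose of layout": by default the ABI uses the C++03 definition of POD, under which a class with private non-static data members is not an aggregate and therefore not a POD. Its dsize then becomes sizeof(long) + sizeof(int) (instead of sizeof(B) = 2 * sizeof(long)), and the next member i can be placed in the tail padding.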
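One way to observe dsize indirectly is to place a char right after the [[no_unique_address]] member and look at its offset. This is only a sketch: TailProbe and data_size_probe are hypothetical names made up for illustration, and the numbers assume the Itanium ABI with an 8-byte long (for example x86-64 Linux with GCC or Clang).

```cpp
#include <cstddef>  // offsetof, std::size_t

struct B {   // public members: POD for the purpose of layout
    long l;
    int i;
};

class C {    // private members: not POD for the purpose of layout
    long l;
    int i;
};

// Hypothetical probe type. A char has alignment 1, so it lands on the first
// byte the ABI considers free after t, which approximates dsize(T).
template <class T>
struct TailProbe {
    [[no_unique_address]] T t;
    char after;
};

// Offset of the probe byte; not an official way to query dsize.
template <class T>
constexpr std::size_t data_size_probe = offsetof(TailProbe<T>, after);
```

Under those assumptions I would expect data_size_probe<B> to be 16 and data_size_probe<C> to be 12, matching the dsize values described above. On an ABI that ignores [[no_unique_address]], both would simply equal sizeof(T).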

The same thing happens if T is used as a base class instead of a member:

template <class T>
#if BASE_CLASS
struct A : T {                   // T as a base-class subobject
    int i;
};
#else
struct A {
    [[no_unique_address]] T t;   // T as a potentially-overlapping member
    int i;
};
#endif

struct B {
    long l;
    int i;
};  // dsize = 16, nvalign = 8, sizeof = 16

class C {
    long l;
    int i;
};  // dsize = 12, nvalign = 8, sizeof = 16 (next multiple of 8)

// A<B>: 16 bytes B, 4 bytes for int i, 4 bytes for padding
// dsize = 20, sizeof = 24

// A<C>: 12 bytes C, 4 bytes for int i, no padding
// dsize = 16, sizeof = 16

// (The same numbers for `[[no_unique_address]] T t;`)

CWG issue 43, which the ABI text references, concerns how C++98 specified memcpy to work on all objects of POD type. Under that wording, a class containing a POD subobject had to leave the subobject's tail padding unused, because a memcpy into the subobject would otherwise overwrite whatever was stored in that padding. C++03 narrowed the wording to "any object (other than a base-class subobject) of POD type T", but the layout rules were not changed (presumably so as not to break the expectation that you can memcpy POD types, even when they are base classes).
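To make the memcpy concern concrete, the following sketch checks whether A<T>::i is stored inside the sizeof(T) bytes that begin at A<T>::t. The helper next_member_in_tail_padding is a hypothetical name, and the check assumes t sits at offset 0, as it does in the question's A.

```cpp
#include <cstddef>  // offsetof

struct B { long l; int i; };  // public members
class C { long l; int i; };   // private members

template <class T>
struct A {
    [[no_unique_address]] T t;
    int i;
};

// Hypothetical helper: true when A<T>::i lies within the first sizeof(T)
// bytes of A<T>, i.e. inside the tail padding of t (t is at offset 0 here).
// When it is true, std::memcpy(&a.t, &src, sizeof(T)) would also write
// over a.i.
template <class T>
constexpr bool next_member_in_tail_padding = offsetof(A<T>, i) < sizeof(T);
```

With GCC or Clang under the Itanium ABI, I would expect this to be false for B and true for C, which is exactly the case the layout rule for PODs is there to avoid.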

A [[no_unique_address]] member was specified to behave like a base-class subobject (both are potentially-overlapping subobjects), so it inherits this layout behaviour.
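A small side-by-side check of the two forms, as a sketch only: ViaBase, ViaMember and same_size_either_way are hypothetical names, and the concrete sizes in the note below assume the Itanium ABI with an 8-byte long.

```cpp
struct B { long l; int i; };  // public members
class C { long l; int i; };   // private members

// T as a base-class subobject.
template <class T>
struct ViaBase : T {
    int i;
};

// T as a [[no_unique_address]] member.
template <class T>
struct ViaMember {
    [[no_unique_address]] T t;
    int i;
};

// True when both ways of embedding T produce the same overall size.
template <class T>
constexpr bool same_size_either_way =
    sizeof(ViaBase<T>) == sizeof(ViaMember<T>);
```

With GCC or Clang on such a platform, both forms should come out at 24 bytes for B and 16 bytes for C, the same numbers as in the comments above.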