c++ language-lawyer

A class derives from a base class, and the type of its first member derives from that same base class. Is the class standard-layout?


As far as I know, one property of a standard-layout class is that the address of an object of that class equals the address of its first non-static data member. I tested the following code with g++ and clang++ and found that Derived3 is reported as a standard-layout class, yet &d is not equal to &d.c.

#include <iostream>
#include <type_traits>
using namespace std;

struct Base {};

struct Derived1 : Base
{
  int i;
};

struct Derived3 : Base
{
  Derived1 c;
  int i;
};

int main()
{
  cout << is_standard_layout_v<Derived3> << endl;

  Derived3 d;
  cout << &d << endl;
  cout << &d.c << endl;

  return 0;
}

Solution

  • Going by the letter of the standard, both Derived1 and Derived3 are indeed standard-layout types. Going through the points one by one:

    A class S is a standard-layout class if it:

    • has no non-static data members of type non-standard-layout class (or array of such types) or reference, [...]

    int is standard-layout. Derived1 is standard-layout, as we'll see.

    • has no non-standard-layout base classes,

    Base is empty, so standard-layout.

    • has at most one base class subobject of any given type,

    Both Derived1 and Derived3 have only a single base, Base.

    • has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and

    Meaning, within an inheritance hierarchy, all data members are declared in the same class. This is clearly true for Derived1. It is also true for Derived3, because Derived1 is the type of a member there, not part of Derived3's inheritance hierarchy.

    To make this point clearer, consider a simpler example

    struct B {};
    struct D1 : B {};
    struct D3 : B { D1 c; };
    

    This also runs into the same address problem as in the question, yet it clearly fulfills this bullet point.
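
As a rough check of this bullet point, the sketch below feeds the simpler example to the type trait and adds a contrast case. The names Holder and Split are hypothetical and not part of the original answer; the offset check assumes a compiler that lays the class out the way gcc and clang do in the question.

```cpp
#include <cstddef>
#include <type_traits>

// The simpler example from the answer.
struct B {};
struct D1 : B {};
struct D3 : B { D1 c; };

// Hypothetical contrast case: data members are declared in two different
// classes of one inheritance hierarchy, which violates this bullet point.
struct Holder { int x; };
struct Split : Holder { int y; };

// What the type trait reports for each class.
constexpr bool d3_is_standard_layout = std::is_standard_layout<D3>::value;
constexpr bool split_is_standard_layout = std::is_standard_layout<Split>::value;

// Byte offset of D3's first (and only) data member.
constexpr std::size_t offset_of_c = offsetof(D3, c);

static_assert(d3_is_standard_layout, "D3 declares all of its data members itself");
static_assert(!split_is_standard_layout, "Split and Holder both declare data members");
// Assumption: gcc/clang-style layout, where c cannot share offset 0 with the B base.
static_assert(offset_of_c != 0, "the first member does not sit at the object's address");
```

Every check is a static_assert, so the snippet passes or fails at compile time.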

    • has no element of the set M(S) of types as a base class, where for any type X, M(X) is defined as follows. [Note 2: M(X) is the set of the types of all non-base-class subobjects that can be at a zero offset in X. — end note]
      • If X is a non-union class type with no non-static data members, the set M(X) is empty.
      • If X is a non-union class type with a non-static data member of type X0 that is either of zero size or is the first non-static data member of X (where said member may be an anonymous union), the set M(X) consists of X0 and the elements of M(X0). [...]

    Meaning, M(Derived3) is the set {Derived1, int}, neither of which is a base class of Derived3.

    Likewise, M(Derived1) is the set {int}, and int is not a base class of Derived1.
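
To see the M(S) rule actually reject something, it may help to compare Derived3 with a class whose first member is of the base type itself. FirstMemberIsBase is a hypothetical name introduced only for this sketch; for it, M(S) is {Base}, and Base is a base class.

```cpp
#include <type_traits>

struct Base {};
struct Derived1 : Base { int i; };

// Hypothetical contrast case: the first member has the base type itself,
// so M(FirstMemberIsBase) = {Base}, which contains a base class.
struct FirstMemberIsBase : Base { Base b; int i; };

// The class from the question: M(Derived3) = {Derived1, int}.
// The Base inside Derived1 is a base-class subobject, so it is not in the set.
struct Derived3 : Base { Derived1 c; int i; };

static_assert(!std::is_standard_layout<FirstMemberIsBase>::value,
              "Base is both a base class and an element of M(S)");
static_assert(std::is_standard_layout<Derived3>::value,
              "neither Derived1 nor int is a base class of Derived3");
```

The only difference between the two classes is whether the conflicting Base is reached directly through the first member or through that member's own base, and the wording of the rule only looks at the former.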


    Being standard-layout means that an object of the class and its first non-static data member are pointer-interconvertible. To be pedantic, two pointers printing as different values doesn't by itself prove there's a problem, but comparing against the result of reinterpret_cast does:

    std::cout << (&d.c == reinterpret_cast<Derived1*>(&d));  // 0 for clang and gcc
    

    Thus the compilers are technically non-conforming. However, this is an impossible situation: the Base subobject in Derived1 cannot have the same address as the Base subobject in Derived3, which is why the compilers place the Derived1 member at a four-byte offset from the start.
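
One way to convince yourself that the clash between the two Base subobjects is what moves the member is to swap the outer base for an unrelated empty class. Other and Derived4 are hypothetical names added for this sketch, and the expected offsets assume a gcc/clang-style layout as in the question.

```cpp
#include <cstddef>
#include <type_traits>

struct Base {};
struct Other {};  // hypothetical empty class, unrelated to Base

struct Derived1 : Base { int i; };

// As in the question: the base and the first member both contain a Base.
struct Derived3 : Base { Derived1 c; int i; };

// Hypothetical variant: same members, but the empty base is a different type.
struct Derived4 : Other { Derived1 c; int i; };

constexpr std::size_t offset_in_derived3 = offsetof(Derived3, c);
constexpr std::size_t offset_in_derived4 = offsetof(Derived4, c);

// Two Base subobjects may not share an address, so c is moved away from offset 0.
static_assert(offset_in_derived3 != 0, "c is pushed away from the start of Derived3");
// An Other and a Base may share an address, so c can stay at offset 0.
static_assert(offset_in_derived4 == 0, "c sits at the start of Derived4");
```

With the unrelated base the first member goes back to offset zero, which is the layout the pointer-interconvertibility rule expects.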

    Standard-layout classes have a history of defect reports, and this looks like it should be another one.