c++ gcc clang language-lawyer

zero-init in gcc vs default-init in clang?


I came across unexpected diffs between gcc and clang while value-initializing objects, and suspect a bug (or two).

  1. Setup 1:
struct A {
    A() {}
    int x;
};

struct B : A {
    int y;
};

int main() {
...
 B b {};  // How should b.x be initialized?
...
}

gcc makes B b {} zero-initialize A, and clang makes it default-initialize A (it doesn't touch x): https://godbolt.org/z/8znhr41ro

Now for spelunking the standard to understand who is right. The value-initialization clause says:

9 To value-initialize an object of type T means:

(9.1) if T is a (possibly cv-qualified) class type ([class]), then

(9.1.1) if T has either no default constructor ([class.default.ctor]) or a default constructor that is user-provided or deleted, then the object is default-initialized;

(9.1.2) otherwise, the object is zero-initialized and the semantic constraints for default-initialization are checked, and if T has a non-trivial default constructor, the object is default-initialized;

(9.2) if T is an array type, then each element is value-initialized;

(9.3) otherwise, the object is zero-initialized.

While 9.1.2 is rather poorly phrased, I think the item relevant to this code is 9.3 - "the object is zero initialized". The zero-initialization clause, a few paragraphs prior, does define handling of base classes:

6 To zero-initialize an object or reference of type T means:

(6.1) if T is a scalar type ([basic.types.general]), the object is initialized to the value obtained by converting the integer literal 0 (zero) to T;

(6.2) if T is a (possibly cv-qualified) non-union class type, its padding bits ([basic.types.general]) are initialized to zero bits and each non-static data member, each non-virtual base class subobject, and, if the object is not a base class subobject, each virtual base class subobject is zero-initialized;

...

So I think gcc is right here, and this is a clang bug. *

  2. Setup 2 - comment out B's int y member:
struct A {
    A() {}
    int x;
};

struct B : A {
//  int y;
};

This should have behaved the same as setup 1, but gcc's behavior changes: https://godbolt.org/z/Pvnh556de . Here both gcc and clang default-initialize (and do not zero-initialize) A, and I suspect this might be a bug in both.

Are any of these indeed bugs to report? Or am I missing something?


  * BTW I'm uneasy with the standard here. The user expressed their intent about what needs to happen when A is instantiated: "Do nothing". I'd say this intent should carry through to instances of A embedded as subobjects (members or bases). Zero-initializing A might not even make any sense. But that's a different story that I'm happy to postpone to another occasion.

Solution

  • I will assume C++17 or later as per the comments in the question:

    B b {}; is syntactically direct-list-initialization by empty initializer list, list-initialization referring to the use of a braced initializer list. The rules for this are specified in [dcl.init.list].

    B is an aggregate class (since C++17). As a consequence any list-initialization results semantically in aggregate-initialization, not value-initialization.
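
The aggregate classification can be checked at compile time. The following is a minimal sketch, assuming a C++17 (or later) compiler so that std::is_aggregate_v is available; A and B are the types from the question.

```cpp
#include <type_traits>

// Same shapes as in the question.
struct A {
    A() {}   // user-provided default constructor: A is not an aggregate
    int x;
};

struct B : A {   // public base class, no user-declared constructors
    int y;
};

// std::is_aggregate_v requires C++17; compile with -std=c++17 or later.
static_assert(!std::is_aggregate_v<A>, "A has a user-provided constructor");
static_assert(std::is_aggregate_v<B>, "B is an aggregate since C++17");
```

In C++14 the trait does not exist, and B would not count as an aggregate there anyway because it has a base class.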

    This is in contrast to initialization with empty parentheses, which would be value-initialization (setting aside the grammar ambiguity that makes it unusable as a declaration initializer), and to a declaration without an initializer, which would be default-initialization.

    Given that no aggregate element has any explicit initializer in the aggregate-initialization, each element is initialized as if by = {}, i.e. copy-list-initialization by an empty initializer list.

    The consequence is that B::y is zero-initialized (as in int y = {};) which I won't go into detail on.

    A is not an aggregate class because it has a user-provided constructor, and therefore initialization with = {} falls through the list-initialization rules until [dcl.init.list]/3.5, which states that the subobject will be value-initialized.

    And per your quotes, because A does have a default constructor that is user-provided, the subobject is default-initialized by (9.1.1). To default-initialize a class type does not imply any zero-initialization, but only initialization by a call to the default constructor, which in your case doesn't initialize B::A::x.

    So, B::A::x has an indeterminate value.

    Removing the B::y member doesn't change anything about this. However, your approach to determining whether or not x has an indeterminate value is flawed. Trying to read an indeterminate int causes undefined behavior and the compiler doesn't have to provide any values consistent with what had been stored at the same memory location before.

    So both compilers behave correctly in all cases.
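
As a sketch of what a C++17 program can rely on after B b {};, the snippet below reads only b.y. The helper names y_after_empty_braces and demo are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

struct A {
    A() {}   // user-provided, leaves x uninitialized
    int x;
};

struct B : A {
    int y;
};

// Hypothetical helper: aggregate-initializes a B with {} and returns only
// the member whose value the standard guarantees (C++17 or later).
int y_after_empty_braces() {
    B b{};        // base A: value-init -> default-init via A(); y: as if by = {}
    return b.y;   // well-defined read of a zero-initialized int
    // b.x is deliberately not read: its value is indeterminate.
}

// Hypothetical usage: the guaranteed part of the result can be asserted.
void demo() {
    assert(y_after_empty_braces() == 0);
}
```

Whatever a particular compiler happens to store in b.x, a conforming program cannot tell the difference, because it has no defined way to look.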


    Had you initialized with parentheses, e.g.

    B b = B();
    

    then the whole B object would have been value-initialized which would have implied zero-initialization of all of B, which recursively would have zero-initialized B::A::x. In that case all compilers would be required to print 0 in your test cases.
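
A minimal sketch of the parenthesized case, assuming C++11 or later. The helper name paren_init_zeroes_everything is hypothetical.

```cpp
struct A {
    A() {}   // user-provided, does not touch x
    int x;
};

struct B : A {
    int y;
};

// Hypothetical helper: value-initializes a B with empty parentheses.
// B's default constructor is implicit (not user-provided), so the whole
// object is zero-initialized first and only then is the constructor run.
bool paren_init_zeroes_everything() {
    B b = B();
    return b.x == 0 && b.y == 0;   // both reads are well-defined here
}
```

Because A() leaves x alone, the zero written by the preceding zero-initialization is still there when x is read.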


    Regarding your last point: Even though the members will not be initialized per the above, it is unobservable to a program whether or not they were zeroed anyway, since any attempt to read the value would be UB. Therefore a compiler is free to perform the zero-initialization regardless, under the as-if rule.


    Regarding previous C++ versions without going into too much detail:

    In C++11 and C++14 zero-initialization of x is guaranteed with or without y, because B isn't an aggregate class in those versions. {} therefore causes value-initialization of the whole B object, which means zero-initialization for lack of a user-provided/deleted default constructor, which recursively implies zero-initialization of all subobjects, including x. (Per the value-initialization rules this is then still (usually) followed by a default constructor call that may replace the zero-initialization values.)

    In C++98 and C++03 compilation will fail because B is not an aggregate class and therefore initialization with {} is not permitted syntax.

    In C++98 and C++03 the rules for value-initialization are also different and wouldn't cause recursive zero-initialization anyway. This was however changed to the current behavior by CWG 178 and CWG 543 which according to cppreference should be considered defect reports against C++98 as well (I have no official reference on that).