EDIT at the bottom
I found a behavior of the pair constructor that I don't fully understand.
I tried to initialize a pair with rvalues, using this code:
pair<vector<int> &&, int> a([]() -> vector<int> {
vector<int> b{1};
cout << &b << ' ' << b[0] << '\n';
return b;
}(), 0);
cout << &a.first << ' ' << a.first[0] << '\n';
The output is
0x62fdf0 1
0x62fdf0 14162480
so apparently a.first is garbage.
I then found the constructors of pair documented online as:
pair (const first_type& a, const second_type& b);
template<class U, class V> pair (U&& a, V&& b);
So I guess the second one is being used? Then I tried removing the &&:
pair<vector<int>, int> a([]() -> vector<int> {
vector<int> b{1};
cout << &b << ' ' << b[0] << '\n';
return b;
}(), 0);
cout << &a.first << ' ' << a.first[0] << '\n';
But now a.first has a different address than b:
0x62fdf0 1
0x62fdc0 1
But if I remove the useless outer pair, the code works (i.e. same address and same value). Why? And how can I make the pair work?
EDIT
After trying to understand comments by @cigien, I reduced the old code heavily to
pair<int &&, int> a(1, 2);
cout << a.first << '\n';
which gave me the warning: '<anonymous>' is used uninitialized in this function [-Wuninitialized]. Then I read this post about the same warning.
I attempted to get rid of the warning by reading the definition of the pair constructor here, which led to the following (more vanilla) code:
struct s_t {
int && first = 0;
} var;
int main() {
var.first = 2;
cout << var.first << '\n';
cout << var.first << '\n';
}
This code no longer triggers the warning, but its output is:
2
0
which honestly leaves me without a clue. Any help is appreciated.
You are creating temporary objects, making (rvalue) references to them, and then trying to use those references after the underlying object is gone.
In the first example, []() -> vector<int> { ... }() is a prvalue expression (instructions for making an object) of type vector<int>. The pair constructor requires a reference to a real object, not instructions for making one, so the lambda is called and a temporary vector<int> is created before the constructor runs. The constructor stores a reference to that temporary in the pair, and the temporary is destroyed at the end of the full-expression, once the constructor has finished and control has returned to you. The first element of the pair is therefore a dangling reference, and reading it yields garbage.
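The ordering described above can be observed without touching the dangling reference. The following is a minimal sketch, not the library's implementation: Noisy, RefHolder and run_dangling_demo are hypothetical names, and RefHolder is a hand-written stand-in for pair<T&&, int> whose constructor takes and stores an rvalue reference the same way.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records its construction and destruction in a log.
struct Noisy {
    std::vector<std::string>* log;
    explicit Noisy(std::vector<std::string>* l) : log(l) { log->push_back("ctor"); }
    Noisy(const Noisy&) = delete;
    ~Noisy() { log->push_back("dtor"); }
};

// Hypothetical stand-in for pair<Noisy&&, int>: it stores the reference itself.
struct RefHolder {
    Noisy&& first;
    int second;
    RefHolder(Noisy&& a, int b) : first(std::move(a)), second(b) {}
};

std::vector<std::string> run_dangling_demo() {
    std::vector<std::string> log;
    // The temporary Noisy lives only until the end of this full-expression.
    RefHolder h(Noisy(&log), 0);
    // h.first is already dangling here, so it is deliberately never read.
    log.push_back("after construction");
    (void)h;
    return log;
}
```

The log shows the destructor entry before the "after construction" entry, which is the same sequence that leaves a.first dangling in the question.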
In the second example, the temporary is created in the same way and the constructor again receives a reference to it, but this time, instead of storing the reference (i.e. constructing a vector<int>&& from the vector<int>&&), it constructs a vector<int> from the reference. Constructing a vector<int> from a vector<int>&& is handled by a particular overload of vector<int>'s constructor, which has the special name "move constructor". This constructor takes the handle to the heap allocation inside the temporary object and simply gives ownership of it to the object inside the pair. Once the temporary is destroyed, the actual data survives, but the handle (i.e. the vector<int> object itself) lives at a different address, which is why &a.first differs from &b.
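To make the distinction between the handle and the data concrete, here is a small sketch under stated assumptions: MoveReport and move_into_pair are hypothetical names, and the buffer comparison relies on the usual behavior of the vector move constructor with the default allocator, which hands over the existing allocation instead of copying it.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical result type for this illustration.
struct MoveReport {
    bool same_buffer;  // did the heap allocation change hands unchanged?
    bool same_object;  // is the vector object itself at the same address?
    int first_value;
};

MoveReport move_into_pair() {
    std::vector<int> source{1};
    const int* buffer_before = source.data();
    const void* object_before = &source;

    // Move-constructs a.first from source: the buffer is transferred,
    // but a.first is a separate vector object with its own address.
    std::pair<std::vector<int>, int> a(std::move(source), 0);

    return {a.first.data() == buffer_before,
            static_cast<const void*>(&a.first) == object_before,
            a.first[0]};
}
```

The element stays where it was on the heap, while the vector object that owns it is a new one inside the pair.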
Your final example has the same problem as the first. When var is initialized, 0 (an int prvalue) does not refer to a real object, so the reference var.first cannot bind to it directly. Instead, 0 is materialized into a temporary object, var.first is made to refer to that object, and then the temporary is destroyed. var.first is now dangling and you cannot do anything with it, so the rest of the code is UB and there's not much point in figuring out what is going wrong "under the hood" (though it certainly seems interesting). In fact, initializing a reference member from a temporary like this is so useless that the initialization of var is ill-formed (it should be a "hard" compile error) as of Defect Report 1696. (Old compilers may still accept it, and even current ones may incorrectly demote it to a warning.)
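For contrast, two well-defined variants of the same experiment are sketched below; s_value and both function names are hypothetical. The first stores the int by value, so there is no temporary to outlive. The second binds a temporary to a local rvalue reference, where the temporary's lifetime is extended to that of the reference, unlike the default member initializer case.

```cpp
#include <cassert>

// Hypothetical by-value counterpart of s_t from the question.
struct s_value {
    int first = 0;
};

bool by_value_is_stable() {
    s_value var;
    var.first = 2;
    int a = var.first;
    int b = var.first;
    return a == 2 && b == 2;
}

bool local_reference_is_stable() {
    // Binding a temporary to a local reference extends the temporary's
    // lifetime to the end of the reference's scope.
    int&& r = 0;
    r = 2;
    int a = r;
    int b = r;
    return a == 2 && b == 2;
}
```

Both functions read the value twice and get 2 both times, because the object being read is still alive.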
Now, if you really, really want to construct the vector<int> "directly" inside the pair without calling its move constructor, you can create an object that defers the construction of the vector<int> until the moment the field of the pair is constructed:
template<typename F>
struct initializing {
F self;
initializing(F self) : self(std::move(self)) { }
operator decltype(auto)() { return self(); }
};
and now say
pair<vector<int>, int> a(
initializing([]() -> vector<int> {
vector<int> b{1};
cout << &b << ' ' << b[0] << '\n';
return b;
}), 0);
Here the lambda is a prvalue of an anonymous closure type. It initializes the by-value parameter of initializing's constructor, and the constructor builds the self field by calling the closure type's move constructor on that parameter (in this case a no-op, since nothing is captured). The resulting initializing<that_anonymous_type> prvalue is materialized into a temporary, and a reference to that temporary initializing object is passed to pair's constructor, which constructs a vector<int> out of the initializing. This calls the conversion function defined in initializing with the result object set to the uninitialized field of the pair. The conversion function then calls the lambda with the result object also set to the field of the pair. The lambda then initializes that result object (the first element of the pair) directly by moving b at the return (or, when NRVO is applied, b becomes a name for the result object/first element of the pair and is initialized at the top of the lambda). This way, no vector constructors are called except for the one or two "visibly" written in the lambda, and the addresses of b and a.first will be the same if NRVO is applied. A slightly more dangerous version is
template<typename F>
struct initializing {
F &&self;
initializing(F &&self) : self(std::forward<F>(self)) { }
operator decltype(auto)() { return std::forward<F>(self)(); }
};
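which also eliminates the move of the closure object materialized from the lambda, at the cost of holding a reference to a temporary, so the initializing object must not be used beyond the full-expression that creates it.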
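For reference, the value-holding version can be assembled into a self-contained sketch like the one below. The defer helper and build_pair are hypothetical additions (defer only avoids relying on class template argument deduction); the wrapper itself follows the first definition above. Whether b and a.first share an address depends on NRVO and on how the compiler treats the conversion result, so the sketch checks only the resulting values.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Same idea as the first wrapper above: hold a callable, run it on conversion.
template <typename F>
struct initializing {
    F self;
    explicit initializing(F f) : self(std::move(f)) {}
    // Converts to whatever the callable returns (here vector<int>).
    operator decltype(auto)() { return self(); }
};

// Hypothetical helper, not part of the original answer.
template <typename F>
initializing<F> defer(F f) {
    return initializing<F>(std::move(f));
}

std::pair<std::vector<int>, int> build_pair() {
    // The vector is produced only when pair constructs its first member.
    return std::pair<std::vector<int>, int>(
        defer([]() -> std::vector<int> {
            std::vector<int> b{1};
            return b;
        }),
        0);
}
```

Calling build_pair() yields a pair whose first element is a one-element vector containing 1 and whose second element is 0.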