
Best way to preserve the life... of temporaries from a C++ expression


Consider a binary operation X with an overloaded right-associative C++ operator: a+=b+=c --> Y{a, X{b,c}}

It is possible to "freeze" all the information about the operands of the expression in some kind of syntax tree (combinations of X and Y objects) and access it later. (This is not the question.)

struct X{Operand& l; Operand& r; /*...*/};
struct Y{Operand& l; X r; /*...*/};
Operand a, b, c;
auto x = Y{a, X{b,c}};
//access members of x...

If I store Y::r by value (as above), then a copy or at least a move will be involved. If I store Y::r as an rvalue reference (e.g. X&& r;), then it will refer to a temporary which is destroyed when the full expression ends, leaving me with a dangling reference.

What is the best way to catch it or prevent this automatic destruction in order to use the already constructed expression multiple times in multiple places?

  • By catching I mean prolonging somehow its life, not manually assigning it to a local variable (works but doesn't scale well! think about the case: a+=b+=…+=z)
  • I know moving is cheaper than copying... but doing nothing is even better (the object is there, already constructed)
  • You can pass the expression as an rvalue reference argument to a function or lambda and access its members while inside that function/lambda... but you cannot reuse it (outside)! You have to recreate it every time (somebody called this approach "Lambda of doom"; maybe there are other drawbacks)

Here is a test program (live at https://godbolt.org/z/7f78T4zn9):

#include <assert.h>
#include <cstdio>
#include <utility>

#ifndef __FUNCSIG__
#   define __FUNCSIG__ __PRETTY_FUNCTION__
#endif

template<typename L,typename R> struct X{
    L& l; R& r;
    X(L& l, R& r): l{l}, r{r} {printf("X{this=%p &l=%p &r=%p} %s\n", this, &this->l, &this->r, __FUNCSIG__);};
    ~X(){printf("X{this=%p} %s\n", this, __FUNCSIG__);};

    X(const X& other) noexcept      = delete;
    X(X&& other) noexcept           = delete;
    X& operator=(const X&) noexcept = delete;
    X& operator=(X&&) noexcept      = delete;
};
template<typename L,typename R> struct Y{
    L& l; R&& r;
    Y(L& l, R&& r): l{l}, r{std::forward<R>(r)} {
        printf("Y{this=%p &l=%p r=%p} %s\n", this, &this->l, &this->r, __FUNCSIG__);
        assert(&this->r == &r);
    };
    ~Y(){printf("Y{this=%p} %s\n", this, __FUNCSIG__);};
    void func(){printf("Y{this=%p} &r=%p ... ALREADY DELETED! %s\n", this, &r, __FUNCSIG__);};
};

struct Operand{
    Operand(){printf("Operand{this=%p} %s\n", this, __FUNCSIG__);}
    ~Operand(){printf("Operand{this=%p} %s\n", this, __FUNCSIG__);}
};

//================================================================
int main(){
    Operand a, b, c;
    printf("---- 1 expression with temporaries\n");
    auto y = Y{a, X{b,c}};//this will come from an overloaded right-associative C++ operator, like: a+=b+=c
    printf("---- 2 immediately after expression... but already too late!\n");//at this point the temporary X obj is already deleted
    y.func();//access members...
    printf("---- 3\n");
    return 0;
}

Here is an output sample where you can see the address of X temporary object going into Y::r ... and destroyed immediately after, before having a chance to catch it:

---- 1 expression with temporaries
X{this=0x7ffea39e5860 &l=0x7ffea39e584e &r=0x7ffea39e584f} X::X(Operand&, Operand&)
Y{this=0x7ffea39e5850 &l=0x7ffea39e584d r=0x7ffea39e5860} Y::Y(Operand&, X&&)
X{this=0x7ffea39e5860} X::~X()
---- 2 immediately after expression... but already too late!

Solution

  • There is no way of extending the life of temporaries in the way you wish.

    There are a few ways the lifetime of a temporary can be extended. Most of them are not helpful. For example, a temporary used in the initialization of a member during a constructor persists until the end of the constructor. This can be useful in one "ply" of such an expression tree, but doesn't help for two.

    One interesting way a temporary's life can be extended is to become the subject of a reference.

    {
        const std::string& x = std::string("Hello") + " World";
        foo();
        std::cout << x << std::endl; // Yep!  Still "Hello World"
    }
    

    That temporary will persist until x goes out of scope. But this won't do anything to extend the lives of the other temporaries. The std::string("Hello") temporary will still get destroyed at the end of that line, even if "Hello World" lives on. And for your particular goal you need "Hello" as well.

    At this point, can you tell I've been frustrated by this problem before?

    There are two approaches that I've found which are consistent.

    • Manage your tree by copy and move, such that the final templated expression truly contains the objects (this is the answer you did not want. Sorry)
    • Manage your tree by clever references to avoid the copies. Then make it impossible to assign it to a local variable by deleting constructors. Then use the operand in only one expression (this is the other answer you did not want, as it lends itself to the lambda tricks you mention).
      • And it gets totally broken by someone who knows you can assign the value into a new local reference (because the non-root temporary nodes vanish). However, maybe this is acceptable. Those people know who they are, and they deserve all the trouble they get for trying to be special (not that I was one of said people...)

    I've done both approaches myself. I made a JSON engine with temporary variables that, when compiled with a half-decent g++ or Visual Studio, actually compiled down to the minimum number of pre-compiled stores into the stack needed to create my data structures. It was glorious (and almost bug free...). And I've built the boring "just copy the data" structures.

    What have I found? In my experience, the corner where this kind of shenanigans pays off is very small. You need:

    • A hyper high-performance situation where the cost of these constructors and destructors is not trivial.
    • There's a reason for an expression tree structure in the first place, like DAG transformations, which couldn't be done with a simple lambda
    • The code calling this library needs to look so clean that you can't afford to allocate your own local variables for the leaf nodes (thus sidestepping the only really unshakable case where you can't just copy things at the last moment).
    • You can't rely on the optimizer to optimize your code.

    Usually at least one of these four conditions gives way. In particular, I notice both the STL and Boost have a tendency to go the copy-it-all approach. STL functions copy by default, and provide std::ref for when you want to get yourself into a sketchy situation in return for performance. Boost has quite a few expression trees like this. To the best of my knowledge, they all rely on copy-it-all. I know Boost.Phoenix does (Phoenix is basically the completed version of your original example), and Boost.Spirit does as well.

    Both of those examples follow a pattern that I think you have to follow: the root node "owns" its descendants, either at compile time with clever templates which have the operands as member variables (rather than references to said operands, à la Phoenix), or at run time (with pointers and heap allocations).

    Also, consider that your code becomes rigidly dependent on a perfectly to-spec C++ compiler. I don't think those actually exist, despite the best efforts of compiler developers who are better than I. You're living in the tiny corner where "But it's spec compliant" can get refuted with "But I can't compile it on any modern compiler!"

    I love the creativity. And please, if you figure out how to do what you want, please loudly comment on my answer so that I can learn from your cleverness. But from my own efforts to dredge the C++ spec for ways to do exactly what you seek, I'm pretty sure it isn't there.