Tags: c++, compiler-construction, stack, temporaries

How does the compiler determine the needed stack size for a function with compiler generated temporaries?


Consider the following code:

class cFoo {
    private:
        int m1;
        char m2;
    public:
        int doSomething1();
        int doSomething2();
        int doSomething3();
};

class cBar {
    private:
        cFoo mFoo;
    public:
        cFoo getFoo(){ return mFoo; }
};

void some_function_in_the_callstack_hierarchy(cBar aBar) {
    int test1 = aBar.getFoo().doSomething1();
    int test2 = aBar.getFoo().doSomething2();
    ...
}

In the line where getFoo() is called, the compiler generates a temporary object of type cFoo so that it can call doSomething1(). Does the compiler reuse the stack memory used for these temporary objects? How much stack memory does a call to "some_function_in_the_callstack_hierarchy" reserve? Does it reserve memory for every generated temporary?

My guess was that the compiler reserves memory for only one object of cFoo and reuses that memory for the different calls, but if I add

    int test3 = aBar.getFoo().doSomething3();

I can see that the stack size needed for "some_function_in_the_callstack_hierarchy" grows considerably, and it's not only because of the additional local int variable.

On the other hand, if I then replace

cFoo getFoo(){ return mFoo; }

with a version that returns a reference (only for testing purposes, because returning a reference to a private member is not good)

const cFoo& getFoo(){ return mFoo; }

it needs far less stack memory; the difference is more than the size of one cFoo.

So to me it seems that the compiler reserves extra stack memory for every generated temporary object in the function. But this would be very inefficient. Can someone explain this?


Solution

  • The optimizing compiler transforms your source code into some internal representation and normalizes it.

    With free software compilers (like GCC & Clang/LLVM), you are able to look into that internal representation (at the very least by patching the compiler code or running it in some debugger).

    BTW, sometimes temporary values do not need any stack space at all, e.g. because they have been optimized away or because they can sit in registers. Quite often they reuse some slot in the current call frame that is no longer needed. Also (particularly in C++) a lot of small functions are inlined, as your getFoo probably is, so they don't have any call frame of their own. Recent versions of GCC are sometimes even able to perform tail-call optimization (essentially, reusing the caller's call frame).

    If you compile with GCC (i.e. g++), I suggest experimenting with its optimization options and developer options (and some others). Perhaps also use -Wstack-usage=48 (or some other value, in bytes per call frame) and/or -fstack-usage.

    First, if you can read assembler code, compile yourcode.cc with g++ -S -fverbose-asm -O yourcode.cc and look into the emitted yourcode.s.

    (Don't forget to experiment with the optimization flags, e.g. replace -O with -O2, -O3, etc.)
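
To have something concrete to run these commands on, here is a minimal sketch of the question's code as a complete translation unit. The member function bodies, the initial values, the const qualifiers, the int return value and the self_check() helper are assumptions added only so that the file compiles on its own; they are not from the question.

```cpp
// yourcode.cc (the file name used in the commands of this answer)
//
// Things to try with g++:
//   g++ -O0 -fstack-usage -c yourcode.cc      // writes yourcode.su
//   g++ -O2 -fstack-usage -c yourcode.cc      // compare the reported sizes
//   g++ -O2 -Wstack-usage=48 -c yourcode.cc   // warn about frames that may exceed 48 bytes
// The .su file has one line per function: its name, the frame size in
// bytes, and a qualifier such as "static", "dynamic" or "bounded".
#include <cassert>

class cFoo {
    private:
        int m1 = 1;    // assumed initial value
        char m2 = 3;   // assumed initial value
    public:
        // Bodies are placeholders; const lets them be called through a
        // const reference as well as on a temporary.
        int doSomething1() const { return m1 + 1; }
        int doSomething2() const { return m1 + 2; }
        int doSomething3() const { return m1 + m2; }
};

class cBar {
    private:
        cFoo mFoo;
    public:
        // Swap in "const cFoo& getFoo() const" to compare frame sizes.
        cFoo getFoo() const { return mFoo; }
};

int some_function_in_the_callstack_hierarchy(cBar aBar) {
    int test1 = aBar.getFoo().doSomething1();  // first temporary cFoo
    int test2 = aBar.getFoo().doSomething2();  // second temporary cFoo
    int test3 = aBar.getFoo().doSomething3();  // third temporary cFoo
    return test1 + test2 + test3;              // keep the results observable
}

// Small sanity check of the placeholder arithmetic (2 + 3 + 4).
void self_check() {
    assert(some_function_in_the_callstack_hierarchy(cBar()) == 9);
}
```

The reported frame sizes depend on the compiler version, target and flags, so compare the -O0 and -O2 results on your own machine rather than expecting particular numbers.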

    Then, if you are more curious about how the compiler is optimizing, try g++ -O -fdump-tree-all -c yourcode.cc and you'll get a lot of so-called "dump files", which contain a partial textual rendering of the internal representations used by GCC.

    If you are even more curious, look into my GCC MELT and notably its documentation page (which contains a lot of slides & references).

    "So for me it seems that the compiler reserves extra stack memory for every generated temporary object in the function."

    Certainly not, in the general case (and of course assuming you enable some optimizations). And even if some space is reserved, it would be very quickly reused.
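
Why a slot can be reused so quickly: a temporary is destroyed at the end of the full expression that created it, so the temporaries of two consecutive statements are never alive at the same time. Here is a minimal sketch that checks this with an instance counter. Tracked, Holder and the function name are hypothetical names made up for this illustration; Holder plays the role of cBar.

```cpp
#include <cassert>

// Hypothetical instrumented type (not part of the question's code):
// it counts how many instances are currently alive.
struct Tracked {
    static int alive;
    int value;
    explicit Tracked(int v) : value(v) { ++alive; }
    Tracked(const Tracked& other) : value(other.value) { ++alive; }
    ~Tracked() { --alive; }
    int get() const { return value; }
};
int Tracked::alive = 0;

// Plays the role of cBar: hands out copies of a member by value.
struct Holder {
    Tracked member{7};
    Tracked getCopy() const { return member; }
};

// Returns true if no temporary outlives the statement that created it.
bool temporaries_die_at_end_of_statement() {
    Holder h;
    const int baseline = Tracked::alive;  // h.member is already counted here
    assert(baseline >= 1);

    int a = h.getCopy().get();            // a temporary lives during this statement
    const bool gone_after_first = (Tracked::alive == baseline);

    int b = h.getCopy().get();            // a second, independent temporary
    const bool gone_after_second = (Tracked::alive == baseline);

    return gone_after_first && gone_after_second && a == 7 && b == 7;
}
```

This only shows that the lifetimes do not overlap, which is what makes sharing one slot legal. Whether a given compiler actually shares the slot at a given optimization level is a separate question, best answered by looking at the generated code or at the -fstack-usage output.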

    BTW: notice that the C++11 standard does not require a call stack at all (it speaks of automatic storage duration, and of "stack unwinding" for exceptions, but not of how that storage is implemented). One could imagine a C++ program compiled without using any stack, e.g. by a whole-program optimization that detects a program without recursion and lays out all of its frames statically. I don't know of any such compiler, but I do know that compilers can be quite clever.