Search code examples
cgccreturnmemory-addressgcc-extensions

GCC - modify where execution continues after function return


Is it possible to do something like this in GCC?

void foo() {
    if (something()) returnSomewhereElse;
    else return;
}

void bar() {
    foo();
    return; // something failed, no point of continuing
    somewhereElse:
    // execution resumes here if something succeeds
    // ...
}
  • can this be achieved in a portable way, using C and GCC extensions, without using platform specific assembly?
  • the stack state will not change between the normal and altered return points, so is it possible to reuse the code that restores the stack and registers state from the regular return?
  • considering that the function may or may not be inlined, if called it must alter the return address, if inlined it must only alter the code path but not the current function return address, as that would break the code
  • the alternate return point does not need to be a label, but I am hoping GCC's address of label extension can come in handy in this situation

Just to clarify the intent - it is about error handling. This example is a minimal one, just to illustrate things. I intend to use that in a much deeper context, to stop execution if an error occurs. I also assume the state doesn't change, which I may be wrong about, because there are no extra local variables added between the two return points, so I was hoping the code generated by the compiler to do that on foo's return can be reused for that and save the overhead of using longjmp, setting and passing the jump buffers.

The example "does make sense" because its intent is to show what I want to achieve, not why and how it would make sense in actual code.

Why is your idea simpler of better then simply returning a value from foo() and having bar() either return or execute somewhereElse: conditionally?

It is not simpler, and what you suggest is not applicable in practice, only in the context of a trivial example, but it is better because:

1 - it doesn't involve an additional returning of a value

2 - it doesn't involve an additional checking of a value

3 - it doesn't involve an additional jump

I am probably incorrectly assuming the goal should be clear at this point, and after all the clarifications and explanations. The idea is to provide an "escape code path" from a deep call chain without any additional overhead. By reusing the code the compiler has generated to restore the state of the previous call frame and simply modify the instruction at which execution resumes after the function returns. Success skips the "escape code path", the first error that occurs enters it.

if (failure) return; // right into the escape code path
else {
    doMagickHere(); // to skip the escape code path
    return; // skip over the escape code path
}

//...
void bar() {
    some locals;
    foo();
    // enter escape code path here on foo failure so
    destroy(&locals); // cleanup
    return; // and we are done
    skipEscapeCodePath: // resume on foo success
    // escape path was skipped so locals are still valid
}

As for the claims made by Basile Starynkevitch that longjmp is "efficient" and that "Even a billion longjmp remains reasonable" - sizeof(jmp_buf) gives me a hefty 156 bytes, which is apparently the space needed to save pretty much all registers and a bunch of other stuff, so it can later be restored. Those are a lot of operations, and doing that a billion times is far, far outside of my personal understandings of "efficient" and "reasonable". I mean a billion jump buffers themselves are over 145 GIGABYTES of memory alone, and then there is the CPU time overhead as well. Not a whole lot of systems out there which can even afford that kind of "reasonable".


Solution

  • After giving it some extra thought, it is not as simple as it initially appeared. There is one thing that prevents it from working - the code of functions is not context aware - there is no way of knowing the frame it was invoked in, and that has two implications:

    1 - modifying the instruction pointer if not portable is easy enough, as every implementation defines a consistent place for it, it is usually the first thing on the stack, however modifying its value to skip the escape trap will also skip the code which restores the previous frame state, since that code is there, not in the current frame - it can't perform state restoration since it has no information of it, the remedy for that, if an extra check and jump are to be omitted is to duplicate the state restoring code in both locations, unfortunately, this can only be done in assembly

    2 - the amount of instructions which need to be skipped is unknown too, and depends on which is the previous stack frame, depending on the number of locals that need destruction it will vary, it will not be a uniform value, the remedy for that would be to push both the error and success instruction pointers on the stack when the function is invoked, so it can restore one or the other depending on whether an error occurs or not. Unfortunately, that too can only be done in assembly.

    Seems that such a scheme can only be implemented on level compiler, demanding its own calling convention, which pushes two return locations and inserts state restoration code at both. And the potential savings from this approach hardly merit the effort to write a compiler.