Why is this compiler output wrong when setjmp is involved?


Given this C code:

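#include <stdio.h>
#include <setjmp.h>
void foo(int x) {
  jmp_buf env;
  if (setjmp(env) == 0) {
    printf("%d\n", 23);  /* first return from setjmp */
    longjmp(env, 1);     /* jump back: setjmp now returns 1 */
  } else {
    printf("%d\n", x);   /* second return: x must still be intact */
  }
}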

This should print 23 and then the value of x, and the behaviour should be fully defined.

But let's say the compiler does not know that setjmp and longjmp are special functions and generates the following code:

;function foo
;r0 : int x

foo:
    sub sp, sp, #sizeof(jmp_buf) ; reserve space for env
    push r0         ; save x for later
    add r0, sp, #4  ; load address of env
    call setjmp
    pop r1          ; restore SP, move x to r1 <<== corrupt after longjmp
    cmp r0, #0      ; if (setjmp(env) == 0)
    bne 1f
    lea r0, "%d\n"  ; printf("%d\n", 23)
    mov r1, #23
    call printf
    mov r0, sp  ; load address of env
    mov r1, #1
    call longjmp
    b 2f
1:
    lea r0, "%d\n"      ; printf("%d\n", x), x already in r1
    call printf
2:
    add sp, sp, #sizeof(jmp_buf) ; release space for env
    ret

This prints 23 as expected, but then, instead of x, it prints the return address of the longjmp call, i.e. the address of the b 2f instruction that follows it.

The variable x is only temporarily stored on the stack to preserve it across the setjmp function call (r0, being an argument register, is caller-saved). I think that is a perfectly valid thing for a compiler to do. But since setjmp returns twice, the second pop reads a stack slot that the later calls have already overwritten. This corrupts the variable, while the C standard says it should not be.


Solution

  • setjmp is specified as a macro, not a function. This is the standard's recognition that on some implementations it might require features not available to normal functions.

    The standard explicitly allows the macro to simply expand to a function of the same name, for implementations in which setjmp can be written as a function with standard call semantics. However, if an application program attempts to bypass the macro, either with #undef or by using (setjmp)(jmpbuf), it incurs undefined behaviour. This is the opposite of ordinary standard library functions, which may additionally be implemented as macros but can always be reached as real functions by using the techniques above to suppress macro expansion.

    Also, the fact that setjmp is specified to be a macro means that &setjmp is also undefined behaviour. In fact, the standard only allows a call to setjmp in two contexts:

    1. As a complete expression statement, possibly with an explicit cast to void

    2. In the controlling expression of an if, switch, or loop statement, and only when that expression is

      • The setjmp call itself

      • The operator ! with the setjmp call as its argument

      • A comparison between the setjmp call and an integer constant expression.

    In other words, the value of the call to setjmp cannot be saved or participate in arithmetic, and no other computation can be performed inside the sequence points which surround the context of the call.

    So the standard gives an implementation a lot of latitude in how it implements setjmp.