c gcc assembly osdev got

undefined reference to `_GLOBAL_OFFSET_TABLE_' in gcc 32-bit code for a trivial function, freestanding OS


I have a small C source file (function.c):

int function()
{
    return 0x1234abce;
}

I am using a 64-bit machine. However, I want to write a small 32-bit OS, so I want to compile the code into a 'pure' assembly/binary file.

I compile my code with:

gcc function.c -c -m32 -o function.o -ffreestanding # This gives you the object file

I link it with:

ld -o function.bin -m elf_i386 -Ttext 0x0 --oformat binary function.o

I am getting the following error:

function.o: In function `function':
function.c:(.text+0x9): undefined reference to `_GLOBAL_OFFSET_TABLE_'

Solution

  • You need -fno-pie; the default (in most modern distros) is -fpie: generate code for a position-independent executable. This is a code-gen option separate from the -pie linker option (which gcc also passes by default), and is independent of -ffreestanding. -fpie -ffreestanding implies you want a freestanding PIE that uses a GOT, so that's what GCC targets.
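
A minimal sketch of how function.c could report (or reject) an accidental PIE build. It assumes a GCC-compatible compiler, which predefines `__PIE__` when -fpie or -fPIE is in effect; `BUILT_AS_PIE` and `built_as_pie` are illustrative names that are not part of the original question.

```c
/* function.c: sketch of a build-mode check (helper names are illustrative).
 * Intended build, assuming a GCC-compatible driver:
 *   gcc -m32 -ffreestanding -fno-pie -c function.c -o function.o
 */

/* GCC and clang predefine __PIE__ when -fpie or -fPIE is in effect. */
#if defined(__PIE__)
#define BUILT_AS_PIE 1
#else
#define BUILT_AS_PIE 0
#endif

/* To make a PIE build fail loudly instead of just reporting it, uncomment:
 * #if BUILT_AS_PIE
 * #error "compile with -fno-pie"
 * #endif
 */

/* Returns 1 if this file was compiled as PIE code, 0 otherwise. */
int built_as_pie(void)
{
    return BUILT_AS_PIE;
}

/* The function from the question, unchanged in behavior. */
int function(void)
{
    return 0x1234abce;
}
```

Once the object is compiled with -fno-pie, the ld command from the question should no longer need `_GLOBAL_OFFSET_TABLE_`.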

    -fpie only costs a bit of speed in 64-bit code (where RIP-relative addressing is possible) but is quite bad for 32-bit code: compilers get a pointer to the GOT in one of the integer registers (tying up another one of the 8) and access static data relative to that address with [reg + disp32] addressing modes like [eax + foo@GOTOFF].
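
The function in the question touches no static data, so it never needs the GOT pointer. A small hypothetical test file like the one below (`counter` and `bump` are made-up names) gives the compiler a reason to use it; the exact instructions emitted depend on the compiler version and options.

```c
/* got_demo.c: hypothetical file for inspecting 32-bit PIE code generation.
 * One way to look at the output, assuming GCC with 32-bit support:
 *   gcc -m32 -fpie -O1 -S -masm=intel got_demo.c
 * The access to `counter` is then expected to be relative to the GOT base
 * register, in the style of [reg + counter@GOTOFF].
 */

/* Static data that lives in .bss; its address is not known until load time
 * when the code is position-independent. */
static int counter;

/* Increments the static counter and returns the new value. */
int bump(void)
{
    return ++counter;
}
```

Compiling the same file with -fno-pie instead should show a plain absolute address for `counter`.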


    With optimization disabled, gcc -fpie -m32 generates the address of the GOT in a register even though the function doesn't access any static data. You can see this if you look at your compiler output (with gcc -S instead of -c on the machine you're compiling on).

    On Godbolt we can use -m32 -fpie to give the same effect as a GCC configured with --enable-default-pie:

    # gcc9.2 -O0 -m32 -fpie
    function():
            push    ebp
            mov     ebp, esp                        # frame pointer
            call    __x86.get_pc_thunk.ax
            add     eax, OFFSET FLAT:_GLOBAL_OFFSET_TABLE_  # EAX points to the GOT
            mov     eax, 305441742                  # overwrite with the return value
            pop     ebp
            ret
    
    __x86.get_pc_thunk.ax:          # this is the helper function gcc calls
            mov     eax, DWORD PTR [esp]
            ret
    

    The "thunk" returns its return address, i.e. the address of the instruction after the call. The .ax name means to return in EAX. Modern GCC can choose any register; traditionally the 32-bit PIC base register was always EBX, but modern GCC chooses a call-clobbered register when that avoids an extra save/restore of EBX.
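
As a rough C analogue of what the thunk does, the sketch below returns the caller's return address. It relies on the GCC/clang extension `__builtin_return_address` and the `noinline` attribute, so it is not ISO C; `get_pc` is an illustrative name, not the helper GCC emits.

```c
/* Rough C analogue of __x86.get_pc_thunk.ax (illustrative only).
 * Uses GCC/clang extensions; not portable ISO C. */

/* noinline keeps this a real call, so there is a return address to read. */
__attribute__((noinline))
void *get_pc(void)
{
    /* Level 0 is the return address of this function itself, i.e. the
     * address of the instruction after the call in the caller. */
    return __builtin_return_address(0);
}
```

The real thunk does the same job with a single load from the top of the stack, which is why GCC can share one copy per register across a whole program.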

    Fun fact: call +0; pop eax would be more efficient, and only 1 byte larger at each call site. You might think that would unbalance the return-address predictor stack, but in fact call +0 is special-cased on most CPUs to not do that: http://blog.stuffedcow.net/2018/04/ras-microbenchmarks/#call0 (call +0 means the rel32 = 0, so it calls the next instruction. That's not how NASM would interpret that syntax, though.)

    clang doesn't generate a GOT pointer unless it needs one, even at -O0. But it does so with call +0;pop %eax: https://godbolt.org/z/GFY9Ht