I'm new to 32-bit assembly, and I compiled a simple C program to assembly to study it. I understand most of the output, except where it uses GOTOFF.
.file "main.c"
.text
.section .rodata
.LC0:
.string "Hello world"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
leal 4(%esp), %ecx
.cfi_def_cfa 1, 0
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
.cfi_escape 0x10,0x5,0x2,0x75,0
movl %esp, %ebp
pushl %ebx
pushl %ecx
.cfi_escape 0xf,0x3,0x75,0x78,0x6
.cfi_escape 0x10,0x3,0x2,0x75,0x7c
call __x86.get_pc_thunk.ax
addl $_GLOBAL_OFFSET_TABLE_, %eax
subl $12, %esp
leal .LC0@GOTOFF(%eax), %edx # <- Here
pushl %edx
movl %eax, %ebx
call puts@PLT
addl $16, %esp
movl $0, %eax
leal -8(%ebp), %esp
popl %ecx
.cfi_restore 1
.cfi_def_cfa 1, 0
popl %ebx
.cfi_restore 3
popl %ebp
.cfi_restore 5
leal -4(%ecx), %esp
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE0:
.size main, .-main
.section .text.__x86.get_pc_thunk.ax,"axG",@progbits,__x86.get_pc_thunk.ax,comdat
.globl __x86.get_pc_thunk.ax
.hidden __x86.get_pc_thunk.ax
.type __x86.get_pc_thunk.ax, @function
__x86.get_pc_thunk.ax:
.LFB1:
.cfi_startproc
movl (%esp), %eax
ret
.cfi_endproc
.LFE1:
.ident "GCC: (GNU) 9.2.0"
.section .note.GNU-stack,"",@progbits
Why does it use GOTOFF? Isn't the address of GOT already loaded in %eax? What is the difference between GOT and GOTOFF?
Yes, %eax already holds the GOT base, which is exactly why it's used as the base register here. symbol@GOTOFF addresses the variable itself, relative to the GOT base (as a convenient but arbitrary choice of anchor). lea of that gives you the symbol's address; mov would give you the data at the symbol (the first few bytes of the string, in this case).
symbol@GOT gives you the offset (within the GOT) of the GOT entry for that symbol. A mov load from there gives you the address of the symbol. (GOT entries are filled in by the dynamic linker.)
The Q&A "Why use the Global Offset Table for symbols defined in the shared library itself?" has an example of accessing an extern variable that does result in getting its address from the GOT and then dereferencing that.
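To see the two relocation types side by side, here is a minimal sketch in C. All names (local_counter, shared_counter, and the functions) are hypothetical, and the assembly mentioned in the comments is what I'd expect from GCC with -m32 -O2 -fPIC, not output copied from a specific build.

```c
/* Hypothetical example: compare @GOTOFF and @GOT accesses in 32-bit PIC. */

/* Private to this file, so its distance from the GOT base is a
 * link-time constant. */
static int local_counter = 5;

/* Default-visibility global: with -fPIC it could be interposed by
 * another module, so the compiler goes through its GOT entry. */
int shared_counter = 7;

/* Writing the static variable keeps the compiler from folding it
 * into a constant. Expected: addl $1, local_counter@GOTOFF(%reg) */
void bump_local(void) { local_counter++; }

/* Expected: movl local_counter@GOTOFF(%reg), %eax   (loads the data) */
int read_local(void) { return local_counter; }

/* Expected: leal local_counter@GOTOFF(%reg), %eax   (computes the address) */
int *addr_local(void) { return &local_counter; }

/* Expected: movl shared_counter@GOT(%reg), %reg     (loads the address)
 *           movl (%reg), %eax                       (then the data)   */
int read_shared(void) { return shared_counter; }
```

Compiling with something like gcc -m32 -O2 -fPIC -S and reading the generated .s file should show the difference. With -fPIE instead, GCC may treat shared_counter like the static variable, because an executable's own definitions can't be interposed.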
BTW, this is position-independent code. Your GCC is configured to generate it by default. If you used -fno-pie -no-pie to make a traditional position-dependent executable, you'd just get a normal, efficient pushl $.LC0. (32-bit x86 has no RIP-relative addressing, so position-independent code is quite inefficient there.)
In a non-PIE (or in 64-bit PIE), the GOT barely gets used at all. The main executable defines space for symbols so it can access them without going through the GOT. libc code uses the GOT anyway (mostly because of symbol interposition in 64-bit code) so letting the main executable provide the symbol doesn't cost anything and makes the non-PIE executable faster.
We can get a non-PIE executable to use the GOT directly for shared library function addresses with -fno-plt, instead of calling into the PLT and having it use the GOT.
#include <stdio.h>
void foo() { putchar('\n'); }
Compiled with gcc9.2 -O3 -m32 -fno-plt on Godbolt (-fno-pie is the default on the Godbolt compiler explorer, unlike on your system):
foo():
        sub     esp, 20                 # gcc loves to waste an extra 16 bytes of stack
        push    DWORD PTR stdout        # [disp32] absolute address
        push    10
        call    [DWORD PTR _IO_putc@GOT]
        add     esp, 28
        ret
Both push and call have a memory operand using a 32-bit absolute address. push is loading the FILE* value of stdout from a known (link-time-constant) address. (There isn't a text relocation for it.)
call is loading the function pointer saved by the dynamic linker from the GOT. (And loading it directly into EIP.)
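In C terms, a call through the GOT behaves much like a call through a function pointer that someone else filled in. A rough analogy, where got_slot_fputc is a hypothetical stand-in for the real GOT entry (which the dynamic linker fills in, rather than a static initializer) and fputc stands in for _IO_putc:

```c
#include <stdio.h>

/* Hypothetical stand-in for a GOT entry: one pointer-sized slot that
 * holds the address of the library function. In a real program the
 * dynamic linker writes this slot; here a static initializer does. */
static int (*got_slot_fputc)(int, FILE *) = fputc;

/* Roughly what "call [slot]" does: load the pointer from memory and
 * jump to it, with no PLT stub in between. */
int newline_via_slot(FILE *out)
{
    return got_slot_fputc('\n', out);
}
```

This only illustrates the indirection. An optimizing compiler is free to see through a static pointer like this one and call fputc directly, which it can't do with a real GOT entry.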