Procedure Linkage Table and Call Relative


I am curious how programs like readelf, objdump and gdb know what to display next to callq instructions. Since the program has yet to run, how do they know how far to 'fall through' the .plt? Do they guess based on the arguments passed to it, or do they actually do a mock run of the program to find out?

For example:

  400ca4:       e8 e7 fb ff ff          callq  400890 <printf@plt>
  400ca9:       48 8b 85 28 ff ff ff    mov    -0xd8(%rbp),%rax

The above code knows to go to printf() in the .plt at 0x400890:

0000000000400890 <printf@plt>:
  400890:       ff 25 ba 17 20 00       jmpq   *0x2017ba(%rip)        # 602050 <_GLOBAL_OFFSET_TA$
  400896:       68 07 00 00 00          pushq  $0x7
  40089b:       e9 70 ff ff ff          jmpq   400810 <_init+0x20>

This is just output from objdump -d so I'm not sure how the program knows it wants printf. The only correlation I can see is the relocation index (pushq $0x7) and the section .dynsym, though it is one value off because it starts at 0:

8: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@GLIBC_2.2.5 (2)

Another thing that confuses me is the reference to the GOT in the .plt entry (#602050). I see from readelf that it is part of .got.plt based on the address range, but how do these programs determine the value before the program is run?

[23] .got.plt          PROGBITS         0000000000602000  00002000
       00000000000000b8  0000000000000008  WA       0     0     8

** Edit **

Symbol table '.dynsym' contains 22 entries:

       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
         1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND free@GLIBC_2.2.5 (2)
         2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND putchar@GLIBC_2.2.5 (2)
         3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND strncpy@GLIBC_2.2.5 (2)
         4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND puts@GLIBC_2.2.5 (2)
         5: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND fclose@GLIBC_2.2.5 (2)
         6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND strlen@GLIBC_2.2.5 (2)
         7: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __stack_chk_fail@GLIBC_2.4 (3)
         8: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@GLIBC_2.2.5 (2)
         9: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
        10: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND ftell@GLIBC_2.2.5 (2)
        11: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
        12: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND malloc@GLIBC_2.2.5 (2)
        13: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND _IO_getc@GLIBC_2.2.5 (2)
        14: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND fseek@GLIBC_2.2.5 (2)
        15: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND fopen@GLIBC_2.2.5 (2)
        16: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND perror@GLIBC_2.2.5 (2)
        17: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND getopt@GLIBC_2.2.5 (2)
        18: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND atoi@GLIBC_2.2.5 (2)
        19: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND exit@GLIBC_2.2.5 (2)
        20: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND fwrite@GLIBC_2.2.5 (2)
        21: 0000000000400a4d    34 FUNC    GLOBAL DEFAULT   13 err

Solution

  • Some of this is from memory, but let's see if I can help you out...

    As to your first question, there's a chain of things that link together. I can't guarantee this is exactly how these tools do it, but it shows that there is a way.

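    1. The PLT has a 1-to-1 correspondence (except for PLT[0], which is special) with a .rel(a).plt section. This section contains relocations for the PLT entries. On x86-64, the pushq operand in each PLT entry (0x7 for printf@plt here) is the index of that entry's relocation in this section.
    2. Each .rel(a).plt entry has an info field which holds a symbol table index, e.g. into .dynsym. The relocation index and the symbol index are separate numbers, which is why 7 and 8 need not match: .dynsym entry 0 is the reserved null symbol.
    3. Each symbol table entry has an offset into the string table (e.g. .dynstr) for its name. This offset is a byte offset starting from the beginning of the string section.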

    So as you can see, you can follow the PLT entry to .rel(a).plt, then to the symbol table, then to the string table, where you'll find "printf". No part of the program has to run for this.
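The chain above can be sketched in a few lines of Python. This is only an illustration of the lookup, not how objdump or gdb are actually implemented. The PLT entry bytes are taken from the question; the .rela.plt, .dynsym and .dynstr contents are small synthetic tables modelled on the question's output (the relocation order and the GOT slots other than 0x602050 are assumptions), and the helper names are made up for this example.

```python
import struct

RELA = struct.Struct("<QQq")    # Elf64_Rela: r_offset, r_info, r_addend
SYM = struct.Struct("<IBBHQQ")  # Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size


def reloc_index_from_plt_entry(entry: bytes) -> int:
    """Return the imm32 of the pushq that follows the 6-byte jmpq *disp(%rip)."""
    if entry[6] != 0x68:
        raise ValueError("expected pushq imm32 at offset 6 of the PLT entry")
    return struct.unpack_from("<I", entry, 7)[0]


def symbol_name(rela_plt: bytes, dynsym: bytes, dynstr: bytes, reloc_index: int) -> str:
    """Follow .rela.plt -> .dynsym -> .dynstr for one relocation index."""
    _r_offset, r_info, _r_addend = RELA.unpack_from(rela_plt, reloc_index * RELA.size)
    sym_index = r_info >> 32                      # ELF64_R_SYM(r_info)
    st_name = SYM.unpack_from(dynsym, sym_index * SYM.size)[0]
    end = dynstr.index(b"\0", st_name)            # names are NUL-terminated
    return dynstr[st_name:end].decode("ascii")


# Synthetic section contents (assumption: relocations appear in symbol order).
names = ["free", "putchar", "strncpy", "puts", "fclose",
         "strlen", "__stack_chk_fail", "printf"]
dynstr = b"\0"
dynsym = SYM.pack(0, 0, 0, 0, 0, 0)               # entry 0 is the null symbol
rela_plt = b""
for i, sym in enumerate(names):
    st_name = len(dynstr)
    dynstr += sym.encode("ascii") + b"\0"
    dynsym += SYM.pack(st_name, 0x12, 0, 0, 0, 0)  # GLOBAL FUNC, undefined
    # Type 7 is R_X86_64_JUMP_SLOT; GOT slots start after three reserved entries.
    rela_plt += RELA.pack(0x602018 + 8 * i, ((i + 1) << 32) | 7, 0)

# Bytes of printf@plt as shown by objdump -d in the question.
plt_entry = bytes.fromhex("ff25ba172000" "6807000000" "e970ffffff")

index = reloc_index_from_plt_entry(plt_entry)
name = symbol_name(rela_plt, dynsym, dynstr, index)
print(index, name)
```

With a real binary, the three byte strings would come from the section contents in the file rather than being built by hand.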

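    To answer your second question, take a look at the section headers (readelf -WS <program>) and the program headers (readelf -Wl <program>), and you'll see the virtual addresses at which the different sections and segments will be loaded. That's where that address range comes from. This binary appears to be a non-position-independent executable, so those addresses are fixed at link time, and a disassembler can compute 0x602050 from the instruction bytes alone: the address of the next instruction plus the displacement, 0x400896 + 0x2017ba.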
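The addresses objdump prints next to callq and jmpq are plain arithmetic on bytes that are already in the file. Here is a minimal sketch using the instruction bytes and the .got.plt range from the question; the function names are invented for this example, and it only handles the three encodings that appear above.

```python
import struct


def rel32_target(addr: int, insn: bytes) -> int:
    """Target of call rel32 (e8) or jmp rel32 (e9) located at addr."""
    if insn[0] not in (0xE8, 0xE9):
        raise ValueError("not a call/jmp rel32")
    (disp,) = struct.unpack_from("<i", insn, 1)   # signed 32-bit displacement
    return addr + len(insn) + disp                # relative to the next instruction


def rip_relative_slot(addr: int, insn: bytes) -> int:
    """Memory slot read by jmpq *disp32(%rip) (ff 25) located at addr."""
    if insn[:2] != b"\xff\x25":
        raise ValueError("not jmpq *disp32(%rip)")
    (disp,) = struct.unpack_from("<i", insn, 2)
    return addr + len(insn) + disp


# Values copied from the objdump and readelf output in the question.
call_dest = rel32_target(0x400CA4, bytes.fromhex("e8e7fbffff"))
got_slot = rip_relative_slot(0x400890, bytes.fromhex("ff25ba172000"))
plt0_dest = rel32_target(0x40089B, bytes.fromhex("e970ffffff"))

GOT_PLT_START, GOT_PLT_SIZE = 0x602000, 0xB8
in_got_plt = GOT_PLT_START <= got_slot < GOT_PLT_START + GOT_PLT_SIZE

print(hex(call_dest), hex(got_slot), hex(plt0_dest), in_got_plt)
```

Once the tool has 0x400890, labelling it printf@plt is a separate step, namely the relocation and symbol lookup described in the first part of this answer.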