Search code examples
cexecutableobject-files

Doubts in executable and relocatable object file


I have written a simple Hello World program.

   #include <stdio.h>
    int main() {
    printf("Hello World");
    return 0;
    }

I wanted to understand how the relocatable object file and executable file look like. The object file corresponding to the main function is

0000000000000000 <main>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   bf 00 00 00 00          mov    $0x0,%edi
   9:   b8 00 00 00 00          mov    $0x0,%eax
   e:   e8 00 00 00 00          callq  13 <main+0x13>
  13:   b8 00 00 00 00          mov    $0x0,%eax
  18:   c9                      leaveq 
  19:   c3                      retq 

Here the function call for printf is callq 13. One thing i don't understand is why is it 13. That means call the function at adresss 13, right??. 13 has the next instruction, right?? Please explain me what does this mean??

The executable code corresponding to main is

00000000004004cc <main>:
  4004cc:       55                      push   %rbp
  4004cd:       48 89 e5                mov    %rsp,%rbp
  4004d0:       bf dc 05 40 00          mov    $0x4005dc,%edi
  4004d5:       b8 00 00 00 00          mov    $0x0,%eax
  4004da:       e8 e1 fe ff ff          callq  4003c0 <printf@plt>
  4004df:       b8 00 00 00 00          mov    $0x0,%eax
  4004e4:       c9                      leaveq 
  4004e5:       c3                      retq 

Here it is callq 4003c0. But the binary instruction is e8 e1 fe ff ff. There is nothing that corresponds to 4003c0. What is that i am getting wrong?

Thanks. Bala


Solution

  • In the first case, take a look at the instruction encoding - it's all zeroes where the function address would go. That's because the object hasn't been linked yet, so the addresses for external symbols haven't been hooked up yet. When you do the final link into the executable format, the system sticks another placeholder in there, and then the dynamic linker will finally add the correct address for printf() at runtime. Here's a quick example for a "Hello, world" program I wrote.

    First, the disassembly of the object file:

    00000000 <_main>:
       0:   8d 4c 24 04             lea    0x4(%esp),%ecx
       4:   83 e4 f0                and    $0xfffffff0,%esp
       7:   ff 71 fc                pushl  -0x4(%ecx)
       a:   55                      push   %ebp
       b:   89 e5                   mov    %esp,%ebp
       d:   51                      push   %ecx
       e:   83 ec 04                sub    $0x4,%esp
      11:   e8 00 00 00 00          call   16 <_main+0x16>
      16:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
      1d:   e8 00 00 00 00          call   22 <_main+0x22>
      22:   b8 00 00 00 00          mov    $0x0,%eax
      27:   83 c4 04                add    $0x4,%esp
      2a:   59                      pop    %ecx
      2b:   5d                      pop    %ebp
      2c:   8d 61 fc                lea    -0x4(%ecx),%esp
      2f:   c3                      ret    
    

    Then the relocations:

    main.o:     file format pe-i386
    
    RELOCATION RECORDS FOR [.text]:
    OFFSET   TYPE              VALUE 
    00000012 DISP32            ___main
    00000019 dir32             .rdata
    0000001e DISP32            _puts
    

    As you can see there's a relocation there for _puts, which is what the call to printf turned into. That relocation will get noticed at link time and fixed up. In the case of dynamic library linking, the relocations and fixups might not get fully resolved until the program is running, but you'll get the idea from this example, I hope.