assembly linker elf i386 relocation

Why can a call to a global function in the same section only be resolved at link time, while a call to a local function is resolved at assembly time?


I have this assembly file, prog.S:

    .text
#------------------------------main----------------------------------
    .globl main
    .type main,@function
main:
pushl %ebp
movl %esp, %ebp
call myGlobalFunction   # call to my global func
call myLocalFunction    # call to my local func 
popl %ebp
ret
.Lmain_end:
    .size main, .Lmain_end-main
#-------------------------myGlobalFunction---------------------------
    .globl myGlobalFunction
    .type myGlobalFunction,@function
myGlobalFunction:
pushl %ebp
movl %esp, %ebp
movl $1, %eax           # return 1
popl %ebp
ret
.LmyGlobalFunction_end:
    .size myGlobalFunction, .LmyGlobalFunction_end-myGlobalFunction
#-------------------------myLocalFunction---------------------------
    .type myLocalFunction,@function
myLocalFunction:
pushl %ebp
movl %esp, %ebp
movl $2, %eax           #return 2
popl %ebp
ret
.LmyLocalFunction:
    .size myLocalFunction, .LmyLocalFunction-myLocalFunction

All three functions are in the same .text section, and main calls the global function first and then the local one.

When I assemble this file with clang:

clang.exe -target i386-none prog.S -o prog.o -c

Here are the section headers I got:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .strtab           STRTAB          00000000 0000a0 000041 00      0   0  1
  [ 2] .text             PROGBITS        00000000 000034 000023 00  AX  0   0  4
  [ 3] .rel.text         REL             00000000 000098 000008 08   I  4   2  4
  [ 4] .symtab           SYMTAB          00000000 000058 000040 10      1   2  4

The relocation section '.rel.text':

Relocation section '.rel.text' at offset 0x98 contains 1 entry:
 Offset     Info    Type            Sym.Value  Sym. Name
00000004  00000302 R_386_PC32        0000000f   myGlobalFunction

The symbol table:

Symbol table '.symtab' contains 4 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000019    10 FUNC    LOCAL  DEFAULT    2 myLocalFunction
     2: 00000000    15 FUNC    GLOBAL DEFAULT    2 main
     3: 0000000f    10 FUNC    GLOBAL DEFAULT    2 myGlobalFunction

And the disassembly of the .text section:

00000000 <main>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   e8 fc ff ff ff          call   4 <main+0x4>
   8:   e8 0c 00 00 00          call   19 <myLocalFunction>
   d:   5d                      pop    %ebp
   e:   c3                      ret

0000000f <myGlobalFunction>:
   f:   55                      push   %ebp
  10:   89 e5                   mov    %esp,%ebp
  12:   b8 01 00 00 00          mov    $0x1,%eax
  17:   5d                      pop    %ebp
  18:   c3                      ret

00000019 <myLocalFunction>:
  19:   55                      push   %ebp
  1a:   89 e5                   mov    %esp,%ebp
  1c:   b8 02 00 00 00          mov    $0x2,%eax
  21:   5d                      pop    %ebp
  22:   c3                      ret

I can see from the disassembly that the PC-relative call to the local function has already been resolved, while the call to the global function is left to be resolved by the linker.

What I don't understand is why the assembler cannot resolve the call to the global function, which is in the same section as main (and therefore, as I understand it, will always be at the same offset from main).

Obviously, if 'main' and 'myGlobalFunction' were not in the same section, the assembler could not resolve the PC-relative call, since the linker can place different sections wherever it wants (or wherever we tell it to with a linker script).

Edit: there is a similar question, but it has no proper answer.


Solution

  • The assembler can resolve a jump or call to myGlobalFunction defined in the same section at assembly time, and sometimes it does so, as Peter Cordes investigated. However, because the function is declared global, it is assumed to be reachable from other sections, too.

    The assembler assumes that the .text section of prog.o might be statically linked with other object files at link time. You are right that if other.o declares myGlobalFunction as external, the relocation record is generated in other.o, and the relocation for call myGlobalFunction in prog.o looks redundant. Perhaps clang.exe assumes that the symbol myGlobalFunction is potentially weak and could be replaced at link time by a global symbol of the same name defined in someother.o, linked together with other.o and prog.o.

    A call to a global function in the same section could be resolved at assembly time. My guess is that Clang defers it to link time and emits a PC-relative relocation (R_386_PC32 here; i386 has no RIP register) so that the target function can still be replaced by a definition from another module.
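
To make the difference between the two calls concrete, here is a small Python sketch that redoes the arithmetic by hand. All numbers are copied from the readelf and objdump output in the question; the helper name `rel32` and the variable names are mine, and the link-time step assumes the linker keeps the myGlobalFunction definition from prog.o (offset 0x0f in .text) rather than replacing it with one from another module.

```python
# Values copied from the question's readelf/objdump output.
R_386_PC32 = 2  # relocation type number in the i386 ELF psABI

def rel32(insn_addr, field_bytes):
    """Decode the signed little-endian operand of a 5-byte `call rel32`.

    Returns (displacement, target), where target is relative to the start
    of .text, just like the addresses in the disassembly.
    """
    disp = int.from_bytes(field_bytes, "little", signed=True)
    next_insn = insn_addr + 5          # rel32 is relative to the next instruction
    return disp, next_insn + disp

# Local call at 0x8: e8 0c 00 00 00 -> the assembler already wrote the final value.
local_disp, local_target = rel32(0x8, bytes.fromhex("0c000000"))

# Global call at 0x3: e8 fc ff ff ff -> the field only holds the addend (-4).
addend, _ = rel32(0x3, bytes.fromhex("fcffffff"))

# The single .rel.text entry: Offset 0x4, Info 0x302.
r_offset, r_info = 0x4, 0x302
sym_index, rel_type = r_info >> 8, r_info & 0xFF   # ELF32_R_SYM, ELF32_R_TYPE

# What the linker stores for R_386_PC32: S + A - P.
S = 0x0F                               # assumed: prog.o's own myGlobalFunction wins
patched = (S + addend - r_offset) & 0xFFFFFFFF
_, global_target = rel32(0x3, patched.to_bytes(4, "little"))

print(f"local call  -> {local_target:#x}")
print(f"relocation  -> symbol #{sym_index}, type {rel_type}")
print(f"global call -> {global_target:#x} after patching with {patched:#x}")
```

The local call already encodes the distance to myLocalFunction, whereas the global call only carries the -4 addend until the linker decides which myGlobalFunction to use. If a different definition were chosen at link time, only `S` would change; nothing in prog.o would have to be reassembled.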