Consider the following piece of code:
$ cat foo.c
static int foo = 100;
int function(void)
{
    return foo;
}
I understand the disassembly of libfoo.so:
$ gcc -m32 -fPIC -shared -o libfoo.so foo.c
$ objdump -D libfoo.so
000004cc <function>:
4cc: 55 push %ebp
4cd: 89 e5 mov %esp,%ebp
4cf: e8 0e 00 00 00 call 4e2 <__x86.get_pc_thunk.cx>
4d4: 81 c1 c0 11 00 00 add $0x11c0,%ecx
4da: 8b 81 18 00 00 00 mov 0x18(%ecx),%eax
4e0: 5d pop %ebp
4e1: c3 ret
000004e2 <__x86.get_pc_thunk.cx>:
4e2: 8b 0c 24 mov (%esp),%ecx
4e5: c3 ret
4e6: 66 90 xchg %ax,%ax
...
000016ac <foo>:
16ac: 64 00 00 add %al,%fs:(%eax)
In function, the address of foo is computed as 0x4d4 (the value of ecx after the call to __x86.get_pc_thunk.cx) + 0x11c0 + 0x18 = 0x16ac, and 0x16ac is indeed the address of foo.
However, I do not understand the disassembly of the object file:
$ gcc -m32 -fPIC -shared -c foo.c
$ objdump -D foo.o
00000000 <function>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: e8 fc ff ff ff call 4 <function+0x4>
8: 81 c1 02 00 00 00 add $0x2,%ecx
e: 8b 81 00 00 00 00 mov 0x0(%ecx),%eax
14: 5d pop %ebp
15: c3 ret
00000000 <foo>:
0: 64 00 00 add %al,%fs:(%eax)
00000000 <__x86.get_pc_thunk.cx>:
0: 8b 0c 24 mov (%esp),%ecx
3: c3 ret
Why call 4 <function+0x4>, and why add $0x2,%ecx?
Update: I added the -r flag to objdump (the -R flag fails with the error "not a dynamic object, Invalid operation"):
$ objdump -D -r foo.o
00000000 <function>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: e8 fc ff ff ff call 4 <function+0x4>
4: R_386_PC32 __x86.get_pc_thunk.cx
8: 81 c1 02 00 00 00 add $0x2,%ecx
a: R_386_GOTPC _GLOBAL_OFFSET_TABLE_
e: 8b 81 00 00 00 00 mov 0x0(%ecx),%eax
10: R_386_GOTOFF .data
14: 5d pop %ebp
15: c3 ret
Now the 4 in call 4 <function+0x4> makes sense, because the relocation for this instruction is at offset 4 in the text section. I still have no clue why there is a 0x2 in add $0x2,%ecx.
The linker performs the relocation such that the final value = symbol + offset - PC, where offset is the value already stored in the instruction. Note that the PC in this formula is the address of the relocation itself, not the address of the instruction, because the linker has no idea about instruction boundaries. The assembler, however, knows about them and can create the proper offsets.
Let's see how the call to __x86.get_pc_thunk.cx works. On x86, the call instruction uses relative addressing, but by the time the offset is applied the PC has already been incremented to point to the following instruction. You can verify this in your first dump:
4cf: e8 0e 00 00 00 call 4e2 <__x86.get_pc_thunk.cx>
4d4: 81 c1 c0 11 00 00 add $0x11c0,%ecx
Notice that the offset in the instruction is 0e. The already incremented PC is 4d4, and sure enough the target of the call is 4e2 = 4d4 + 0e (all numbers in hex).
Now for the version with the relocation:
3: e8 fc ff ff ff call 4 <function+0x4>
4: R_386_PC32 __x86.get_pc_thunk.cx
It uses R_386_PC32, but the relocation is at the second byte of the instruction, while the call needs an offset from the updated PC, which is 4 bytes further on. This means the correct result is 4 less, hence the instruction contains fffffffc, which is -4. Note that no matter what the address of the call is, this offset is always going to be -4. The disassembler automatically adds this to the updated PC, which in this case is 8, so it arrives at call 4 by computing 8 - 4.
Okay, on to the R_386_GOTPC.
3: e8 fc ff ff ff call 4 <function+0x4>
4: R_386_PC32 __x86.get_pc_thunk.cx
8: 81 c1 02 00 00 00 add $0x2,%ecx
a: R_386_GOTPC _GLOBAL_OFFSET_TABLE_
The __x86.get_pc_thunk.cx function simply loads the return address from the stack into the register ecx. The return address in this case is 8. The goal is to have the address of _GLOBAL_OFFSET_TABLE_ in ecx. We need to know how far it is from the reference PC already in ecx and add that distance. The R_386_GOTPC relocation is used for this, but it gives an offset from address 0a, because that is where the relocation entry is. The offset from address 8 is of course 2 more. This 2 is what is encoded in the instruction.
To summarize: the relocation offset stored in the instruction is the difference between the relocation address and the required reference point: offset = PC - reference. In the first case the reference point is 4 bytes higher than the relocation address; in the second case it is 2 bytes lower, which gives offsets of -4 and 2 respectively.