Tags: linux, memory, gdb, objdump

Computing offset of a function in memory


I am reading the documentation for the uprobe tracer, and it includes instructions on how to compute the offset of a function in memory. I quote it here:

Following example shows how to dump the instruction pointer and %ax register at the probed text address. Probe zfree function in /bin/zsh:

# cd /sys/kernel/debug/tracing/
# cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh
# objdump -T /bin/zsh | grep -w zfree
0000000000446420 g    DF .text  0000000000000012  Base        zfree

0x46420 is the offset of zfree in object /bin/zsh that is loaded at 0x00400000.

I do not know why, but they took the objdump output 0x446420 and subtracted 0x400000 to get 0x46420. It seemed like an error to me. Why 0x400000?

I tried the same on my Fedora 23 machine with a 4.5.6-200 kernel.

First I turned off memory address randomization:

echo 0 > /proc/sys/kernel/randomize_va_space

Then I figured out where the binary is mapped in memory:

$ cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
555555554000-55555560f000 r-xp 00000000 fd:00 2387155                    /usr/bin/zsh

Then I took the offset from objdump:

marko@fedora:~ $ objdump -T /bin/zsh | grep -w zfree
000000000005dc90 g    DF .text  0000000000000012  Base        zfree

And found the address of zfree with gdb:

$ gdb -p 21067 --batch -ex 'p zfree'
$1 = {<text variable, no debug info>} 0x5555555b1c90 <zfree>

marko@fedora:~ $ python
Python 2.7.11 (default, Mar 31 2016, 20:46:51) 
[GCC 5.3.1 20151207 (Red Hat 5.3.1-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> hex(0x5555555b1c90-0x555555554000)
'0x5dc90'

As you can see, I got the same value that objdump printed, without subtracting anything.

But then I tried the same on another machine running SLES, and there the result matches the uprobe documentation.

Why is there such a difference? How do I compute the correct offset then?


Solution

  • As far as I can see, the difference can only come from how the examined binary was built; more precisely, from whether the ELF image has a fixed load address or not. Let's do a simple experiment. We have this simple test code:

    int main(void) { return 0; }
    

    Then, build it in two ways:

    $ gcc -o t1 t.c      # create image with fixed load address
    $ gcc -o t2 t.c -pie # create load-base independent image
    

    Now, let's check the load base addresses of these two images:

    $ readelf -l --wide t1 | grep LOAD
      LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x00067c 0x00067c R E 0x200000
      LOAD           0x000680 0x0000000000600680 0x0000000000600680 0x000228 0x000230 RW  0x200000
    $ readelf -l --wide t2 | grep LOAD
      LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0008cc 0x0008cc R E 0x200000
      LOAD           0x0008d0 0x00000000002008d0 0x00000000002008d0 0x000250 0x000258 RW  0x2000
    

    Here you can see that the first image requires a fixed load address, 0x400000, while the second one has no address requirement at all.

    Now we can compare the addresses that objdump reports for main:

    $ objdump -t t1 | grep ' main'
    00000000004004b6 g     F .text  000000000000000b              main
    $ objdump -t t2 | grep ' main'
    0000000000000710 g     F .text  000000000000000b              main
    

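    As we can see, the address objdump prints is the complete virtual address that the first byte of main will occupy if the image is loaded at the address stored in the program header. The second image, of course, will never be loaded at 0x0; it is loaded at some other, randomly chosen location, and that base is added to the function's position. So to get the offset, subtract the load address stored in the program header from the value objdump prints: 0x400000 for the first image, and 0x0 for the second, which is why no subtraction was needed on Fedora.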
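The subtraction can be sketched in Python. This is only an illustration, assuming that the offset wanted is the file offset of the symbol, i.e. its virtual address minus the virtual address of the LOAD segment that contains it, plus that segment's file offset. The helper names (`parse_load_segments`, `symbol_file_offset`, `runtime_address`) are made up for this example, and the input strings are the readelf rows quoted above.

```python
import re

# Rows printed by `readelf -l --wide` look like:
#   LOAD <file offset> <virt addr> <phys addr> <file size> <mem size> <flags> <align>
# The physical address is matched but not captured.
LOAD_RE = re.compile(
    r"^\s*LOAD\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+0x[0-9a-fA-F]+"
    r"\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)",
    re.MULTILINE,
)

# LOAD rows copied from the readelf output for t1 and t2.
T1_LOADS = """
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x00067c 0x00067c R E 0x200000
  LOAD           0x000680 0x0000000000600680 0x0000000000600680 0x000228 0x000230 RW  0x200000
"""
T2_LOADS = """
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0008cc 0x0008cc R E 0x200000
  LOAD           0x0008d0 0x00000000002008d0 0x00000000002008d0 0x000250 0x000258 RW  0x2000
"""


def parse_load_segments(readelf_output):
    """Return a list of (file_offset, vaddr, file_size) for each LOAD row."""
    segments = []
    for match in LOAD_RE.finditer(readelf_output):
        offset, vaddr, filesz, _memsz = (int(g, 16) for g in match.groups())
        segments.append((offset, vaddr, filesz))
    return segments


def symbol_file_offset(sym_vaddr, segments):
    """Translate a symbol's link-time virtual address into a file offset."""
    for offset, vaddr, filesz in segments:
        # Only the file-backed part of a segment has a file offset.
        if vaddr <= sym_vaddr < vaddr + filesz:
            return sym_vaddr - vaddr + offset
    raise ValueError("address %#x is not inside any LOAD segment" % sym_vaddr)


def runtime_address(map_start, map_offset, file_offset):
    """Address of a file offset inside a mapping taken from /proc/PID/maps."""
    return map_start - map_offset + file_offset


# main in t1 (fixed load address) and in t2 (position independent).
print(hex(symbol_file_offset(0x4004B6, parse_load_segments(T1_LOADS))))  # 0x4b6
print(hex(symbol_file_offset(0x710, parse_load_segments(T2_LOADS))))     # 0x710
# zfree on the Fedora machine: mapping start + offset gives the gdb address.
print(hex(runtime_address(0x555555554000, 0x0, 0x5DC90)))  # 0x5555555b1c90
```

For both test images the text segment starts at file offset 0, so the result is just the objdump address minus the segment's virtual address. Looking up the containing segment instead of hard-coding 0x400000 keeps the same arithmetic valid for either kind of binary.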