linux shared-libraries x86-64 gnu-assembler got

How to access a C global variable through GOT in GAS assembly on x86-64 Linux?


My problem

I am trying to write a shared library (not an executable, so please do not tell me to use -no-pie) with assembly and C in separate files (not inline assembly).

I would like to access a C global variable through the Global Offset Table (GOT) from the assembly code, because the function being called might be defined in some other shared library.

I know the basics of the PLT and GOT, but I am not sure how to tell the assembler to generate the right relocation information for the linker (what the syntax is), or how to tell the linker to actually relocate my code with that information (which linker options to use).

My code compiles, but linking fails with this error:

/bin/ld: tracer.o: relocation R_X86_64_PC32 against
/bin/ld: final link failed: bad value

Furthermore, it would help if someone could point to detailed documentation on relocations in GAS assembly, for example an exhaustive reference on how to interoperate between C and assembly with the GNU assembler.

Source Code

Compile the C and assembly code and link them into one shared library.

# Makefile
liba.so: tracer2.S target2.c
    gcc -shared -g -o liba.so tracer2.S target2.c

// target2.c
// NOTE: This is a variable, not a function.
int (*read_original)(int fd, void *data, unsigned long size) = 0;

// tracer2.S
.text
    // external symbol declaration
    .global read_original
read:
  lea read_original(%rip), %rax
  mov (%rax), %rax
  jmp *%rax

Expectation and Result

I expect the linker to happily link my object files but it says

g++ -shared -g -o liba.so tracer2.o target2.c -ldl
/bin/ld: tracer.o: relocation R_X86_64_PC32 against
/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
make: *** [Makefile:2: liba.so] Error 1

and commenting out the line

// lea read_original(%rip), %rax

makes the error disappear.

Solution.

    lea read_original@GOTPCREL(%rip), %rax

The @GOTPCREL suffix tells the assembler to emit a PC-relative relocation against the symbol's GOT entry. The linker then computes the offset from the current RIP to that GOT entry.

You can verify with

$ objdump -d liba.so
    10e9:       48 8d 05 f8 2e 00 00    lea    0x2ef8(%rip),%rax        # 3fe8 <read_original@@Base-0x40>
    10f0:       48 8b 00                mov    (%rax),%rax
    10f3:       ff e0                   jmpq   *%rax
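
One detail worth checking in the sequence above: lea with @GOTPCREL yields the address of the GOT slot itself, whereas mov with @GOTPCREL loads the variable's address out of that slot (the form the compiler output in the answer below uses). A minimal sketch of the mov form follows, assuming x86-64 Linux with GCC or Clang. It uses top-level asm inside a C file only to stay self-contained; the same instructions could live in a separate .S file. The names demo_original, demo_double, and demo_trampoline are hypothetical stand-ins, not part of the original code.

```c
#include <stddef.h>

/* Hypothetical stand-in for the question's read_original: a global
   function pointer with default visibility. */
long (*demo_original)(long) = NULL;

/* Hypothetical target, used only to exercise the trampoline. */
long demo_double(long x) { return 2 * x; }

/* Defined in the asm block below. */
long demo_trampoline(long x);

/* GAS (AT&T) syntax, x86-64 only. Basic asm is emitted verbatim,
   so registers take a single '%'. */
__asm__(
    ".pushsection .text\n"
    ".globl demo_trampoline\n"
    ".type demo_trampoline, @function\n"
    "demo_trampoline:\n"
    /* rax = &demo_original, read from the GOT entry */
    "    mov demo_original@GOTPCREL(%rip), %rax\n"
    /* rax = demo_original, the function pointer stored in the variable */
    "    mov (%rax), %rax\n"
    /* tail-call the target; the argument in rdi is left untouched */
    "    jmp *%rax\n"
    ".size demo_trampoline, .-demo_trampoline\n"
    ".popsection\n"
);
```

After setting demo_original = demo_double, calling demo_trampoline(21) jumps to demo_double with the argument still in rdi.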

Thanks to Peter.

Some information that might be related or not

1. I can call a C function with
  call read@plt

objdump shows it calls into the correct PLT entry.

$ objdump -d liba.so
...
0000000000001109 <read1>:
    1109:       e8 22 ff ff ff          callq  1030 <read@plt>
    110e:       ff e0                   jmpq   *%rax
2. I can lea a PLT entry address correctly

0xffffff23 is -0xdd as a signed 32-bit value, and 0x1109 - 0xdd = 0x102c

0000000000001020 <.plt>:
    1020:       ff 35 e2 2f 00 00       pushq  0x2fe2(%rip)        # 4008 <_GLOBAL_OFFSET_TABLE_+0x8>
    1026:       ff 25 e4 2f 00 00       jmpq   *0x2fe4(%rip)        # 4010 <_GLOBAL_OFFSET_TABLE_+0x10>
    102c:       0f 1f 40 00             nopl   0x0(%rax)

0000000000001030 <read@plt>:
    1030:       ff 25 e2 2f 00 00       jmpq   *0x2fe2(%rip)        # 4018 <read@GLIBC_2.2.5>
    1036:       68 00 00 00 00          pushq  $0x0
    103b:       e9 e0 ff ff ff          jmpq   1020 <.plt>

0000000000001109 <read1>:
    1109:       48 8d 04 25 23 ff ff    lea    0xffffffffffffff23,%rax
    1110:       ff
    1111:       ff e0                   jmpq   *%rax

Environment

  • Arch Linux 20190809
$ uname -a
Linux alex-arch 5.2.6-arch1-1-ARCH #1 SMP PREEMPT Sun Aug 4 14:58:49 UTC 2019 x86_64 GNU/Linux
$ gcc -v
Using built-in specs.
COLLECT_GCC=/bin/gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/9.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared --enable-threads=posix --with-system-zlib --with-isl --enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object --enable-linker-build-id --enable-lto --enable-plugin --enable-install-libiberty --with-linker-hash-style=gnu --enable-gnu-indirect-function --enable-multilib --disable-werror --enable-checking=release --enable-default-pie --enable-default-ssp --enable-cet=auto
Thread model: posix
gcc version 9.1.0 (GCC)
$ ld --version
GNU ld (GNU Binutils) 2.32
Copyright (C) 2019 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.

Solution

  • Apparently the linker enforces global vs. hidden visibility for symbols in ELF shared objects, not allowing "back door" access to symbols that participate in symbol interposition (and thus can potentially be more than 2 GB away).

    To access it directly from other code in the same shared object with normal RIP-relative addressing, make the symbol hidden by setting its ELF visibility as such. (See also https://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/ and Ulrich Drepper's How to Write Shared Libraries)

    __attribute__ ((visibility("hidden")))
     int (*read_original)(int fd, void *data, unsigned long size) = 0;
    

    Then gcc -save-temps tracer2.S target2.c -shared -fPIC compiles, assembles, and links a shared library. GCC also has options like -fvisibility=hidden that make hidden visibility the default, requiring explicit attributes on the symbols you do want to export for dynamic linking. That's a very good idea if you have any globals that you use inside your library, because it lets the compiler emit efficient code for accessing them. It also protects you from global name clashes with other libraries. The GCC manual strongly recommends it.
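
As a rough illustration of the hidden-visibility route, here is a minimal sketch assuming x86-64 Linux with GCC or Clang. It again uses top-level asm in a C file to stay self-contained, and the names demo_hidden_original, demo_negate, and demo_hidden_trampoline are hypothetical. Because the symbol is hidden, plain RIP-relative addressing is accepted when linking with -shared, and the indirect jump can read the pointer directly from memory.

```c
/* Hidden: not exported from the shared object, so it cannot be
   interposed and is always within RIP-relative reach. */
__attribute__((visibility("hidden")))
long (*demo_hidden_original)(long) = 0;

/* Hypothetical target, used only to exercise the trampoline. */
long demo_negate(long x) { return -x; }

/* Defined in the asm block below. */
long demo_hidden_trampoline(long x);

/* GAS (AT&T) syntax, x86-64 only. */
__asm__(
    ".pushsection .text\n"
    ".globl demo_hidden_trampoline\n"
    ".type demo_hidden_trampoline, @function\n"
    "demo_hidden_trampoline:\n"
    /* Load the function pointer RIP-relative and tail-call it,
       with no GOT indirection. */
    "    jmp *demo_hidden_original(%rip)\n"
    ".size demo_hidden_trampoline, .-demo_hidden_trampoline\n"
    ".popsection\n"
);
```

Compared with the GOT version, this saves one load, which is the efficiency difference the answer describes.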

    It also works with g++; C++ name mangling only applies to function names, not variables (including function-pointers). But generally don't compile .c files with a C++ compiler.


    If you do want to support symbol interposition, you need to use the GOT; obviously you can just look at how the compiler does it:

    int glob;                 // with default visibility = default
    int foo() { return glob; }
    

    compiles to this asm with GCC -O3 -fPIC (without any visibility options, so global symbols are fully globally visible: exported from shared objects and participating in symbol interposition).

    foo:
            movq    glob@GOTPCREL(%rip), %rax
            movl    (%rax), %eax
            ret
    

    Obviously this is less efficient than mov glob(%rip), %eax, so prefer keeping your global variables scoped to the library (hidden), not truly global.

    There are tricks you can do with weak aliases to let you export a symbol that only this library defines, and access that definition efficiently via a "hidden" alias.
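
One possible shape of the alias trick is sketched below, assuming GCC or Clang on an ELF target. The names demo_counter, demo_counter_local, and demo_bump are hypothetical, and this is only one arrangement; a real library might instead define the object under the hidden name and make the exported name a weak alias.

```c
/* Exported definition: visible to, and interposable by, other modules. */
int demo_counter = 0;

/* Hidden alias for the same object. References through this name bind
   inside the library, so the compiler can use plain RIP-relative
   addressing instead of going through the GOT. */
extern int demo_counter_local
    __attribute__((alias("demo_counter"), visibility("hidden")));

/* Library-internal code uses the hidden alias. */
int demo_bump(void) { return ++demo_counter_local; }
```

One caveat for data symbols: a non-PIE executable can take a copy relocation of the exported name, in which case the exported name and the hidden alias may no longer refer to the same storage. The sketch ignores that case.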