Tags: assembly, x86-64, elf, gnu-assembler, binutils

How to handle the "relocation R_X86_64_PC32 against protected symbol" when writing shared objects in assembly?


I am writing an object file in assembly language to be included in a shared object. I am using the GNU toolchain, and my target is x86_64-pc-linux-gnu. Please consider the following (example) source:

        .text
        .globl  f
f:      leaq    g(%rip),%rax
        ret
        .data
        .globl  g
        .protected g
g:      .quad   8

The crucial parts are the global protected symbol g and the reference to g in f. When I assemble the source using gcc -c -o example.o --shared -fpic example.s, objdump -x tells me that gas inserted a relocation for the local reference (some relocation entry is obviously necessary):

RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE
0000000000000003 R_X86_64_PC32     g-0x0000000000000004

The problem shows up when I also try to link the file:

$ gcc -o example.so --shared -fpic example.s
/usr/bin/ld: /tmp/ccQ6BcLl.o: relocation R_X86_64_PC32 against protected symbol `g' can not be used when making a shared object
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status

As far as I understand from reading Ian Lance Taylor's blog (but I might be mistaken), this happens because the linker cannot guarantee pointer equality when symbol interposition occurs (in some other object file).

As this will never be an actual problem with my symbol g in my shared object, I would like to silence ld. The Linux ABI 0.1 seems to say that I should add a .note.gnu.property section to my source that contains a setting of GNU_PROPERTY_NO_COPY_ON_PROTECTED. How can I do this in practice?

If possible, I don't want to add extra flags to the invocation of the assembler and linker, so I am looking for a solution where the necessary modifications are just part of the source file.


Solution

  • You cannot reference the symbol like this: although protected visibility shields it from interposition, it is still subject to the same treatment as any other object-type symbol exported from a shared library, namely copy relocation. To fix this, go through the GOT, as with any other exported object-type symbol.

    Background

    The designers of the ELF ABI intended shared objects to be transparent to the main program. ELF ABI programs (but not shared objects) are wholly unaware of the presence of shared objects and are written as if all symbols used by the program were statically linked into the program. This includes object-type symbols, to which direct access is permitted. E.g. the main program can do

    movq    g(%rip), %rax
    

    and gets the value of the same variable g your shared library uses. The way this works is that for all object-type symbols referenced by your main program but provided by a shared library, the linker looks up the size of the symbol at link time and allocates that much space in the BSS segment of the executable. The symbol (g in this case) is resolved to point to that space.

    At load time, the dynamic loader finds the shared library that defines g, copies the data for g from the data segment of the shared object into the space reserved in the BSS segment of the main executable, and resolves the GOT entry for g to that address. This is called a copy relocation. Thus, when the shared library accesses g, it accesses the same variable the main program does. (If the main program does not access g, no copy relocation takes place and g is resolved to its definition within the shared library's data/BSS segment.)

    However, this scheme only works if the shared library accesses the symbol through the GOT, because after a copy relocation the symbol no longer lives at a fixed offset within the shared library. Thus, you must go through the GOT to access the symbol.

    I.e. do

            movq g@GOTPCREL(%rip), %rax  # find the address of g
            movq (%rax), %rax            # load the value of g
    

    The address of g will not change while the program is running, so it suffices to do this once at the beginning of your code. The overhead should be low.
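
    Applied to the source from the question, this might look as follows. This is a minimal sketch rather than a definitive fix: the .type and .size directives are additions that were not in the original source (assumed here so that the symbol carries the size information a copy relocation relies on), and f still returns the address of g, as in the original.

```asm
        .text
        .globl  f
        .type   f, @function
f:      movq    g@GOTPCREL(%rip), %rax  # read the address of g from its GOT entry
        ret                             # (replaces the direct leaq g(%rip),%rax)
        .size   f, .-f

        .data
        .globl  g
        .protected g
        .type   g, @object              # addition: mark g as a data object
        .size   g, 8                    # addition: 8 bytes, matching the .quad below
g:      .quad   8
```

    Since the only reference to g is now GOT-based, building this with the command from the question (gcc -o example.so --shared -fpic example.s) should no longer produce the R_X86_64_PC32 diagnostic.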

    Workarounds

    Workarounds include:

    • Consider making the symbol hidden and only exposing it to the main executable through an accessor function that returns its address. You can use map files (version scripts) to set the visibility for all symbols in your library in one spot, which may be easier than annotating the symbols in the source files.

    • If it doesn't matter whether the main executable and your library see the same address for the symbol (e.g. if it's a constant), you can provide a hidden alias for the symbol and use that for internal references.

    • You can use -Bsymbolic to have the shared library always use its own copy of the symbol, even if it is subject to copy relocation. Be aware that this effectively disables the ability to share variables between the shared library and the main executable. You will also not be able to compare function pointers for equality correctly. For this reason, this option should not be used in production.

    • If you cannot use an accessor function for some reason, you could detour the exported symbol through a pointer, only allowing copy-relocations on the pointer:

              .bss
              .globl local_g
              .hidden local_g
      local_g:
              .space 8
              .data
              .globl g
      g:      .quad local_g
      

      In the main binary, declare g as holding a pointer to the variable and dereference it to access the variable. Consider declaring it const so it cannot be overwritten by accident. Note that this approach performs worse than going through the GOT for accesses to the symbol from other shared objects.

    • You can use -mno-direct-extern-access when compiling and linking all program parts (shared library and main executable) to avoid copy relocations (you might also need to link all parts with -Wl,-z,nocopyreloc). Note that shared libraries compiled this way are ABI-incompatible with main programs that were compiled without this option and must not be linked to them. The other way round is fine.
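
    The first two workarounds (accessor function and hidden alias) could be sketched like this. The names h, get_h and g_local are hypothetical and were chosen only for illustration; the .type and .size directives are likewise additions. Treat it as a sketch under these assumptions, not as the one correct layout.

```asm
        .text

        # Workaround 1: h is hidden, so it is not exported and can be
        # addressed directly. Other modules call get_h to obtain its address.
        .globl  get_h
        .type   get_h, @function
get_h:  leaq    h(%rip), %rax
        ret
        .size   get_h, .-get_h

        # Workaround 2: f refers to the hidden alias g_local instead of g.
        .globl  f
        .type   f, @function
f:      leaq    g_local(%rip), %rax
        ret
        .size   f, .-f

        .data
        .globl  h
        .hidden h
        .type   h, @object
        .size   h, 8
h:      .quad   8

        # g and g_local label the same 8 bytes. Only g is exported. If the
        # main executable copy-relocates g, f keeps using the library's copy,
        # so the two sides may then see different addresses and values.
        .globl  g
        .protected g
        .type   g, @object
        .size   g, 8
        .globl  g_local
        .hidden g_local
        .type   g_local, @object
        .size   g_local, 8
g:
g_local:
        .quad   8
```

    Both direct references here target hidden symbols, which the linker can resolve inside the shared object, so neither needs a GOT entry.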

    The best option, however, is to just go through the GOT as with any other global symbol.