Unprotect start and end of ELF section in library, to override from linked program

I would like to get the start and end pointers of a section in a library, in a way that they can be overridden from the program the library is linked into.

This allows me to specify in the program some parameters as to how the library should load. Here's a concrete example:

foo.c, the library:

#include <stdio.h>

typedef void (*fptr)();
void lib_function();

void dummy()
{
    printf("NO -- I should be overriden by prog_function\n");
}

fptr section_fptrlist __attribute__((weak, section("fptrlist"))) = (fptr)dummy;
extern fptr __start_fptrlist;
extern fptr __stop_fptrlist;

void __attribute__((constructor)) setup()
{
    // setup library: call pre-init functions;
    for (fptr *f = &__start_fptrlist; f != &__stop_fptrlist; f++)
        (*f)();
}

void lib_function()
{
}

bar.c, the program:

#include <stdio.h>

void lib_function();
typedef void (*fptr)();


void pre_init()
{
    printf("OK -- run me from library constructor\n");
}

fptr section_fptrlist __attribute__((section("fptrlist"))) = (fptr)pre_init;

int main()
{
    lib_function();
    return 0;
}

I build libfoo.so from foo.c and then a test program from bar.c and libfoo.so, for example:

gcc -g -O0    -fPIC -shared foo.c -o libfoo.so
gcc -g -O0    bar.c -L. -lfoo -o test

This used to work fine, i.e. with ld version 2.26.1 I get as expected:

$ ./test 
OK -- run me from library constructor

Now with ld version 2.29.1 I get:

$ ./test 
NO -- I should be overriden by prog_function

I have compiled everything on one machine, and only changed the linker step by copying the object file to a different machine, running ld -shared foo.o -o libfoo.so and copying the library back, so as far as I can tell the linker is the only difference between this working and not working.

I also use gcc 7.2.0 and glibc 2.22-62, but as stated above that does not seem to be decisive. The differences between the linker scripts seem minor, and swapping one for the other has made no difference to the result so far (2.26 with 2.29's script works as expected, 2.29 with 2.26's script does not). Here's the diff anyway:

--- ld_script_v2.26     2018-02-02 21:52:56.038573732 +0100
+++ ld_script_v2.29     2018-02-02 21:52:41.154504340 +0100
@@ -1,4 +1,4 @@
@@ -53,5 +53,5 @@ SECTIONS
   .plt            : { *(.plt) *(.iplt) }
 .plt.got        : { *(.plt.got) }
-.plt.bnd        : { *(.plt.bnd) }
+.plt.sec        : { *(.plt.sec) }
   .text           :
   {
@@ -226,4 +226,5 @@ SECTIONS
   /* DWARF Extension.  */
   .debug_macro    0 : { *(.debug_macro) }
+  .debug_addr     0 : { *(.debug_addr) }
   .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) }
   /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) }

Looking at the symbol tables (with readelf -Ws), I noticed that with version 2.29 the __start_fptrlist and __stop_fptrlist symbols are now protected:

with ld 2.29> readelf -Ws libfoo.so  | grep fptr
 8: 0000000000201028     0 NOTYPE  GLOBAL PROTECTED   24 __start_fptrlist
14: 0000000000201028     8 OBJECT  WEAK   DEFAULT     24 section_fptrlist
16: 0000000000201030     0 NOTYPE  GLOBAL PROTECTED   24 __stop_fptrlist
54: 0000000000201028     8 OBJECT  WEAK   DEFAULT     24 section_fptrlist
58: 0000000000201028     0 NOTYPE  GLOBAL PROTECTED   24 __start_fptrlist
62: 0000000000201030     0 NOTYPE  GLOBAL PROTECTED   24 __stop_fptrlist

with ld 2.26> readelf -Ws libfoo.so | grep fptrlist
 9: 0000000000201028     0 NOTYPE  GLOBAL DEFAULT   23 __start_fptrlist
15: 0000000000201028     8 OBJECT  WEAK   DEFAULT   23 section_fptrlist
17: 0000000000201030     0 NOTYPE  GLOBAL DEFAULT   23 __stop_fptrlist
53: 0000000000201028     8 OBJECT  WEAK   DEFAULT   23 section_fptrlist
57: 0000000000201028     0 NOTYPE  GLOBAL DEFAULT   23 __start_fptrlist
61: 0000000000201030     0 NOTYPE  GLOBAL DEFAULT   23 __stop_fptrlist

I am aware, from this answer, that the feature I was relying on is more shady than well defined. I was able to track down that this change was intentional. What is now the best way for me to achieve the goal of calling program function(s) from my library setup? Can I still make this approach work? Is there a way to un-protect those symbols, for example?

Even though this example is small, the problem actually occurs in a pretty big C++ project, so the fewer changes the better.

Solution

I think the problem is

for (fptr *f = &__start_fptrlist; f != &__stop_fptrlist; f++)
    (*f)();

Here the for-loop is expected to run from the __start_fptrlist to the __stop_fptrlist defined in your app. The PROTECTED visibility, however, makes the library resolve those symbols to its own definitions inside the .so.
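For background, the linker defines these two symbols per linked module (program or .so), and they bracket that module's own fptrlist entries. The following single-module sketch shows the mechanism in isolation. The names first, second, entry_first, entry_second and run_fptrlist are made up for illustration, and it assumes GCC or Clang with an ELF linker such as GNU ld:

```c
typedef void (*fptr)(void);

static int calls = 0;   /* counts how many table entries were invoked */

static void first(void)  { calls++; }
static void second(void) { calls++; }

/* Each pointer is placed in the section "fptrlist"; "used" keeps the
 * compiler from discarding the otherwise unreferenced variables. */
static fptr entry_first  __attribute__((used, section("fptrlist"))) = first;
static fptr entry_second __attribute__((used, section("fptrlist"))) = second;

/* Provided by the linker for the module being linked; they mark the
 * start and the end of this module's "fptrlist" section. */
extern fptr __start_fptrlist[];
extern fptr __stop_fptrlist[];

/* Calls every entry in this module's table and returns how many ran. */
int run_fptrlist(void)
{
    calls = 0;
    for (fptr *f = __start_fptrlist; f != __stop_fptrlist; f++)
        (*f)();
    return calls;
}
```

Calling run_fptrlist() from a main() in the same module walks both entries. With PROTECTED symbols, the loop in your library behaves the same way: it only sees the library's own table.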

A simple workaround would be:

// foo.c
/* ... */
fptr *my_start = &__start_fptrlist;
fptr *my_stop = &__stop_fptrlist;

void __attribute__((constructor)) setup()
{
    // setup library: call pre-init functions;
    for (fptr *f = my_start; f != my_stop; f++)
        (*f)();
}

Here the exact values of the my_* variables in the library are not important, because these two names are supposed to be bound to the values from the app.

// bar.c
/* ... */
fptr section_fptrlist __attribute__((section("fptrlist"))) = (fptr)pre_init;

extern fptr __start_fptrlist;
extern fptr __stop_fptrlist;

fptr *my_start = &__start_fptrlist;
fptr *my_stop = &__stop_fptrlist;

This forces your for-loop to go through the addresses from your app instead of your .so, because the my_* symbols are global and are resolved from the app first.

Warning: Code not tested, since I don't have the environment you described. Please let me know whether this approach works on your machine.
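If you would rather not depend on the section symbols at all, a different technique (not the one above, and equally untested in your setup) is a weak hook: the library declares a function as weak and calls it only if some module actually defines it. A minimal sketch of the library side, where lib_pre_init_hook and run_pre_init_hook are hypothetical names chosen for illustration:

```c
#include <stddef.h>

/* Hypothetical hook; weak, so the library still links and loads when
 * no module defines it. In that case its address is NULL. */
extern void lib_pre_init_hook(void) __attribute__((weak));

/* Calls the hook if one was defined.
 * Returns 1 if the hook ran, 0 if no definition was found. */
int run_pre_init_hook(void)
{
    if (lib_pre_init_hook != NULL) {
        lib_pre_init_hook();
        return 1;
    }
    return 0;
}
```

The library constructor would call run_pre_init_hook(), and bar.c would define void lib_pre_init_hook(void) and call pre_init() from it. This assumes the program's definition ends up in its dynamic symbol table; if it does not, linking the program with -rdynamic may be needed.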