
C - Externs - Safe way to monitor value in LD_PRELOAD'ed library


Background

I help maintain a simple command-line tool, diskmanager, used to monitor poor disk performance, primarily due to too many operations/users concurrently using the same disk. My work involves maintaining a library, libdisksupervisor.so, that is occasionally used to "supervise" the disk manager program by launching it via:

LD_PRELOAD=/public/libdisksupervisor.so /sbin/diskmanager

The reason we do this is because the library and the application have very different release schedules, source can't be shared due to cross-NDAs in place, etc. To make our lives easier, the maintainers of diskmanager created a few extern variables in the application, and added some calls to "dummy" functions in a library (libdonothing.so) that they bundled with diskmanager.

When a call is made to int dummy(void) (normally found in libdonothing.so, but intercepted by LD_PRELOAD'ing libdisksupervisor.so, which defines a function with the same prototype), we know that diskmanager is in a state where we can safely read extern int internalStatus (defined in diskmanager) from within our own library. The code for dummy() is quite simple:

/* In source for diskmanager */
int internalStatus = (-1);

/* In libdonothing.so */
int dummy(void) { return 0; }

/* In libdisksupervisor.so */
#include <syslog.h>

extern int internalStatus;
int dummy(void) {
    syslog(LOG_ERR, "State:%d", internalStatus);
    return 0;
}

Problem

So far, so good. A few months back, one of the maintainers of diskmanager did something silly and removed int internalStatus from diskmanager, which made our library crash with a segmentation fault when run as LD_PRELOAD=/public/libdisksupervisor.so diskmanager. A similar issue arose when a junior engineer, fumbling with GCC hidden-visibility attributes, changed some variables to static, again resulting in a segfault.


Question

Is there any way, within our code in libdisksupervisor.so, we can test for the presence of these extern (from the perspective of our library) variables before proceeding, possibly via some cryptic linker or GCC magic? I know I could just throw nm or objdump at it as part of a pre-validation script, but we need to accomplish this within our library alone.

Thank you.


Solution

  • Is there any way, within our code in libdisksupervisor.so, we can test for the presence of these extern (from the perspective of our library) variables before proceeding, possibly via some cryptic linker or GCC magic?

    You have a timing problem here. You in fact don't need to do anything special to test for the presence and visibility of those symbols at compile time, in the version of diskmanager that you link against. The issue arises when you attempt to use libdisksupervisor.so with a version of diskmanager that turns out at runtime to be incompatible.

    I know I could just throw nm or objdump at it as part of a pre-validation script, but we need to accomplish this within our c library alone.

    I am not aware of any approach that would work with the way you are running the program and would not be susceptible to being easily and accidentally foiled by diskmanager maintenance.

    But perhaps there is a way involving changing how you run the program. If what you presently call libdisksupervisor.so provided a program entry point (i.e. main()) and you ran it directly, it could dlopen() diskmanager and check for the presence of the needed symbols via dlsym(). It could then transfer control to diskmanager's main() (also accessed via dlsym()). You can think of this as inserting a shim between the system's dynamic linker and diskmanager.
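
    The presence check itself might look roughly like the following sketch. lookup_symbol is a hypothetical helper name, not part of any existing code; it relies on the documented dlsym()/dlerror() convention of clearing the error state first and treating a non-NULL dlerror() afterwards as "symbol not found". On older glibc versions you would also need to link with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: look up `name` in the object referred to by `handle`.
 * Returns 1 and stores the symbol's address in *addr_out if it is present;
 * returns 0 and stores NULL if the lookup fails.
 */
int lookup_symbol(void *handle, const char *name, void **addr_out) {
    assert(name != NULL && addr_out != NULL);

    dlerror();                         /* clear any stale error state */
    void *addr = dlsym(handle, name);
    const char *err = dlerror();       /* non-NULL only if dlsym() failed */

    if (err != NULL) {
        *addr_out = NULL;
        return 0;
    }
    *addr_out = addr;
    return 1;
}
```

    The shim would call this once per required symbol right after dlopen(), and refuse to hand control to diskmanager's main() if any lookup returns 0.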


    Update:

    The good news is that I have a proof-of-concept demonstrating that it can be done (see below). The bad news is that enabling the main executable to be loaded as a shared library requires special build options, and it sounds like it could be troublesome to get the other side to build with such options. On the other hand, this approach allows them to control and document precisely which symbols are exposed to your side, and maybe that would serve as a suitable carrot.

    Anyway, the POC consists of three C source files, two auxiliary files, and a Makefile:

    dummy.c

    int dummy(void) {
        return 0;
    }
    

    main.c

    #include <stdio.h>
    
    int dummy(void);
    
    #ifndef BREAKME
    int internalStatus = 42;
    #endif
    
    int main(int argc, char *argv[]) {
        printf("dummy() returns %d\n", dummy());
        return 0;
    }
    

    shim.c

    #include <stdlib.h>
    #include <stdio.h>
    #include <dlfcn.h>
    #include <assert.h>
    
    #define TARGET_PATH "./mainprog"
    #define NOT_FOUND_STATUS 127
    #define MISSING_SYM_STATUS 126
    
    typedef int (*main_type)(int, char **);
    
    static int *internalStatus_p;
    #define internalStatus (*internalStatus_p)
    
    int dummy(void) {
        return internalStatus;
    }
    
    #define LOAD_SYM(dso, name, var) do { \
        char *e_; \
        var = dlsym(dso, name); \
        e_ = dlerror(); \
        if (e_) { \
            fprintf(stderr, "%s\n", e_); \
            return MISSING_SYM_STATUS; \
        } \
    } while (0)
    
    int main(int argc, char *argv[]) {
        void *diskmanager_bin = dlopen(TARGET_PATH, RTLD_LAZY | RTLD_GLOBAL);
        char *error;
        main_type main_p;
    
        if (!diskmanager_bin) {
            fprintf(stderr, "Could not load " TARGET_PATH ": %s\naborting\n", dlerror());
            return NOT_FOUND_STATUS;
        } else {
            error = dlerror();
            assert(!error);
        }
    
        LOAD_SYM(diskmanager_bin, "internalStatus", internalStatus_p);
        LOAD_SYM(diskmanager_bin, "main", main_p);
    
        return main_p(argc, argv);
    }
    
    #undef LOAD_SYM
    

    mainprog_dynamic

    {
        main; internalStatus;
    };
    

    shim_dynamic

    {
        dummy;
    };
    

    Makefile

    # sources contributing to a shared library must be built with -fpic or -fPIC
    CFLAGS = -fPIC -std=c99
    LDFLAGS = 
    
    SHLIB_LDFLAGS = -shared
    SHLIB_EXTRALIBS = -lc
    
    # Sources contributing to the main program should be built with -fpie or -fPIE
    SHMAIN_CFLAGS = -fpie
    # The main program must be linked with -pie
    SHMAIN_LDFLAGS = -pie
    
    DL_EXTRALIBS = -ldl
    
    LIBDUMMY_SO_VER = 0
    LIBDUMMY = libdummy.so.$(LIBDUMMY_SO_VER)
    
    all: mainprog shim
    
    mainprog: main.o $(LIBDUMMY) mainprog_dynamic
        $(CC) $(CFLAGS) $(SHMAIN_CFLAGS) $(LDFLAGS) $(SHMAIN_LDFLAGS) -Wl,--dynamic-list=mainprog_dynamic -o $@ $< $(LIBDUMMY) $(SHLIB_EXTRALIBS)
    
    main.o: main.c
        $(CC) $(CPPFLAGS) $(CFLAGS) $(SHMAIN_CFLAGS) -c -o $@ $<
    
    libdummy.so.$(LIBDUMMY_SO_VER): libdummy.so
        ln -sf $< $@
    
    libdummy.so: dummy.o
        $(CC) -o $@ $(CFLAGS) $(LDFLAGS) $(SHLIB_LDFLAGS) -Wl,-soname,libdummy.so.$(LIBDUMMY_SO_VER) $^ $(SHLIB_EXTRALIBS)
    
    shim: shim.o shim_dynamic
        $(CC) $(CFLAGS) $(LDFLAGS) -Wl,--dynamic-list=shim_dynamic -o $@ $< $(DL_EXTRALIBS)
    
    test: all
        @echo "LD_LIBRARY_PATH=`pwd` ./mainprog :"
        @LD_LIBRARY_PATH=`pwd` ./mainprog
        @echo "LD_LIBRARY_PATH=`pwd` ./shim :"
        @LD_LIBRARY_PATH=`pwd` ./shim
    
    clean:
        rm -f *.o *.so *.so.* mainprog shim
    

    This models the situation you describe, where the function you want to override resides in a separate shared library. It assumes the GNU toolchain. Having successfully built the example (make all), you can run make test for a demo:

    $ make test
    LD_LIBRARY_PATH=/tmp/dl ./mainprog :
    dummy() returns 0
    LD_LIBRARY_PATH=/tmp/dl ./shim :
    dummy() returns 42
    

    The *_dynamic files tell the linker about symbols in the two executables that should be included among the exported (dynamic) symbols, even though nothing in the link references them.

    This approach does not allow the shim to refer to the main program's internalStatus variable directly, for then the shim would need to link the main program as a library, and it would be automatically loaded by the dynamic linker when the shim runs. References to variables are always bound immediately, so that would result in an error from the dynamic linker if internalStatus disappeared, outside the control of the shim.
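
    Because the shim only ever reaches the variable through a pointer it resolved itself, it can also choose to keep running when the symbol is missing instead of exiting. The following is only a sketch of that variation, not something the POC above does: read_internal_status is a hypothetical name, and falling back to the stock dummy() behavior (returning 0) is my assumption about what "degrade gracefully" would mean for you.

```c
#include <assert.h>
#include <stddef.h>

/* Set once by the shim's main() from dlsym(); NULL means "symbol not found". */
static int *internalStatus_p = NULL;

/*
 * Hypothetical accessor: copies the host's status into *out and returns 1
 * if the symbol was resolved; leaves *out untouched and returns 0 otherwise.
 */
int read_internal_status(int *out) {
    assert(out != NULL);
    if (internalStatus_p == NULL) {
        return 0;
    }
    *out = *internalStatus_p;
    return 1;
}

/* Replacement hook: never dereferences a symbol that was not found. */
int dummy(void) {
    int status;
    if (!read_internal_status(&status)) {
        return 0;   /* assumed fallback: behave like the stock dummy() */
    }
    return status;
}
```

    Returning a separate found/not-found flag, rather than a sentinel value, avoids colliding with legitimate values of internalStatus such as the -1 it is initialized to in the question.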