Tags: c, gcc, dynamic-library, ld-preload

In gcc is there any way to dynamically add a function call to the start of main()?


I'm dynamically overriding malloc() with a fast_malloc() implementation of mine in a glibc benchmark malloc speed test (glibc/benchtests/bench-malloc-thread.c), by writing these functions in my fast_malloc.c file:

// Override malloc() and free(); see: https://stackoverflow.com/a/262481/4561887

inline void* malloc(size_t num_bytes)
{
    static bool first_call = true;
    if (first_call)
    {
        first_call = false;
        fast_malloc_error_t error = fast_malloc_init();
        assert(error == FAST_MALLOC_ERROR_OK);
    }

    return fast_malloc(num_bytes);
}

inline void free(void* ptr)
{
    fast_free(ptr);
}

Notice the inefficient addition to my malloc() wrapper: it checks on every call whether this is the first call, so that fast_malloc_init() runs exactly once to initialize some memory pools. I'd like to get rid of that check and instead dynamically insert the init call at the start of main(), without modifying the glibc code. Is this possible?

The downside of my current malloc() wrapper is that it skews my benchmark results, making my fast_malloc() look slower than it really is: the init function gets timed by glibc/benchtests/bench-malloc-thread.c, and the extraneous if (first_call) check runs on every malloc() call.

Currently I dynamically override malloc() and free(), while calling the bench-malloc-thread executable, like this:

LD_PRELOAD='/home/gabriel/GS/dev/fast_malloc/build/libfast_malloc.so' \
glibc-build/benchtests/bench-malloc-thread 1

Plot I will be adding my fast_malloc() speed tests to (using this repo): [plot image not included]

LinkedIn post I made about this: https://www.linkedin.com/posts/gabriel-staples_software-engineering-tradeoffs-activity-6815412255325339648-_c8L.

Related:

  1. [my repo fork] https://github.com/ElectricRCAircraftGuy/malloc-benchmarks
  2. [how I learned how to generate *.so dynamic libraries in gcc] https://www.cprogramming.com/tutorial/shared-libraries-linux-gcc.html
  3. Create a wrapper function for malloc and free in C

Solution

  • How to dynamically inject function calls before and after another executable's main() function.

    Here is a full, runnable example for anyone wanting to test this on their own. Tested on Linux Ubuntu 20.04.

    This code is all part of my eRCaGuy_hello_world repo.

    hello_world_basic.c:

    #include <stdbool.h> // For `true` (`1`) and `false` (`0`) macros in C
    #include <stdint.h>  // For `uint8_t`, `int8_t`, etc.
    #include <stdio.h>   // For `printf()`
    
    // int main(int argc, char *argv[])  // alternative prototype
    int main()
    {
        printf("This is the start of `main()`.\n");
        printf("  Hello world.\n");
        printf("This is the end of `main()`.\n");
    
        return 0;
    }
    

    dynamic_func_call_before_and_after_main.c:

    #include <assert.h>
    #include <stdbool.h> // For `true` (`1`) and `false` (`0`) macros in C
    #include <stdint.h>  // For `uint8_t`, `int8_t`, etc.
    #include <stdio.h>   // For `printf()`
    #include <stdlib.h>  // For `atexit()`
    
    /// 3. This function gets attached as a post-main() callback (a sort of program "destructor")
    /// via the C <stdlib.h> `atexit()` call below
    void also_called_after_main()
    {
        printf("`atexit()`-registered callback functions are also called AFTER `main()`.\n");
    }
    
    /// 1. Functions with gcc function attribute, `constructor`, get automatically called **before**
    /// `main()`; see:
    /// https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
    __attribute__((__constructor__))
    void called_before_main()
    {
        printf("gcc constructors are called BEFORE `main()`.\n");
    
        // 3. Optional way to register a function call for AFTER main(), although
        // I prefer the simpler gcc `destructor` attribute technique below, instead.
        int retcode = atexit(also_called_after_main);
        assert(retcode == 0); // ensure the `atexit()` call to register the callback function succeeds
    }
    
    /// 2. Functions with gcc function attribute, `destructor`, get automatically called **after**
    /// `main()`; see:
    /// https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
    __attribute__((__destructor__))
    void called_after_main()
    {
        printf("gcc destructors are called AFTER `main()`.\n");
    }
    

    How to build the dynamic lib*.so shared-object library and load it with LD_PRELOAD as you run another program (see "dynamic_func_call_before_and_after_main__build_and_run.sh" from my eRCaGuy_hello_world repo):

    # 1. Build the other program (hello_world_basic.c) that has `main()` in it which we want to use
    mkdir -p bin && gcc -Wall -Wextra -Werror -O3 -std=c11 -save-temps=obj hello_world_basic.c \
    -o bin/hello_world_basic
    
    # 2. Create a .o object file of this program, compiling with Position Independent Code (PIC); see
    # here: https://www.cprogramming.com/tutorial/shared-libraries-linux-gcc.html
    gcc -Wall -Wextra -Werror -O3 -std=c11 -fpic -c dynamic_func_call_before_and_after_main.c \
    -o bin/dynamic_func_call_before_and_after_main.o
    
    # 3. Link the above PIC object file into a dynamic shared library (`lib*.so` file); link above shows
    # we must use `-shared`
    gcc -shared bin/dynamic_func_call_before_and_after_main.o -o \
    bin/libdynamic_func_call_before_and_after_main.so
    
    # 4. Call the other program with `main()` in it, dynamically injecting this code into that other
    # program via this code's .so shared object file, and via Linux's `LD_PRELOAD` trick
    LD_PRELOAD='bin/libdynamic_func_call_before_and_after_main.so' bin/hello_world_basic
    

    Sample output. Notice that we have injected some special function calls both before AND after the main() function found in "hello_world_basic.c":

    gcc constructors are called BEFORE `main()`.
    This is the start of `main()`.
      Hello world.
    This is the end of `main()`.
    gcc destructors are called AFTER `main()`.
    `atexit()`-registered callback functions are also called AFTER `main()`.
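
    Applied to the question, the same `constructor` attribute can take over the one-time setup, so the malloc() wrapper no longer needs its `if (first_call)` check. Below is a minimal sketch, not the asker's actual code: `fast_malloc_init()`, `fast_malloc()`, and `fast_malloc_error_t` are named in the question but their definitions are not shown, so toy stand-ins (a fixed-size bump allocator) are assumed here, and `fast_malloc_setup()` and `fast_malloc_pools_ready` are hypothetical names added for illustration.

    ```c
    #include <assert.h>
    #include <stdbool.h> // For `bool`, `true`, `false`
    #include <stddef.h>  // For `size_t`, `NULL`

    // ASSUMPTION: toy stand-ins for the question's fast_malloc API. The real
    // definitions live in the asker's fast_malloc.c and are not shown there.
    typedef enum { FAST_MALLOC_ERROR_OK = 0 } fast_malloc_error_t;

    bool fast_malloc_pools_ready = false; // hypothetical flag, for illustration
    static unsigned char pool[4096] __attribute__((aligned(16)));
    static size_t pool_used = 0;

    // Stand-in init: resets the toy pool and marks it ready.
    fast_malloc_error_t fast_malloc_init(void)
    {
        pool_used = 0;
        fast_malloc_pools_ready = true;
        return FAST_MALLOC_ERROR_OK;
    }

    // Stand-in allocator: hands out 16-byte-aligned chunks from the pool and
    // returns NULL when the request does not fit.
    void *fast_malloc(size_t num_bytes)
    {
        size_t rounded = (num_bytes + 15u) & ~(size_t)15u;
        if (rounded < num_bytes || rounded > sizeof(pool) - pool_used)
        {
            return NULL;
        }
        void *ptr = &pool[pool_used];
        pool_used += rounded;
        return ptr;
    }

    // The gcc `constructor` attribute makes this run once, before `main()`,
    // when the shared library is loaded (e.g. via `LD_PRELOAD`).
    __attribute__((__constructor__))
    void fast_malloc_setup(void)
    {
        fast_malloc_error_t error = fast_malloc_init();
        assert(error == FAST_MALLOC_ERROR_OK);
        (void)error; // avoid an unused-variable warning when NDEBUG is defined
    }

    // With setup moved into the constructor, the question's wrapper would
    // reduce to a plain forwarding call, with no per-call `first_call` check:
    //     void* malloc(size_t num_bytes) { return fast_malloc(num_bytes); }
    ```

    Built into the preloaded .so with the same steps as above, the constructor should run once before main(), and so outside the code that the benchmark times. One caveat: any malloc() call made before this constructor runs (for example, from another library's constructor) would reach an uninitialized allocator, so a real implementation may still need a fallback for that case.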
    

    References:

    1. How to build dynamic lib*.so libraries in Linux: https://www.cprogramming.com/tutorial/shared-libraries-linux-gcc.html
    2. @kaylum's comment
    3. @Employed Russian's answer
    4. @Lundin's comment
    5. gcc constructor and destructor function attributes: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
    6. C atexit() function, which registers functions to be called AFTER main() returns or exit() is called: https://en.cppreference.com/w/c/program/atexit