A function declared with __attribute__((constructor)) is invoked more than once with LD_PRELOAD


Define a shared library as follows:

#include <unistd.h>
#include <stdio.h>

static void init(void) __attribute__((constructor));
static void init(void)
{
    fprintf(stderr, "pid=%u\n", (unsigned) getpid());
}

Build using GCC 10 on an AMD64 machine:

gcc -shared -fPIC lib.c

Run a trivial process to validate:

$ LD_PRELOAD=`realpath a.out` ls
pid=15771
a.out  lib.c

The line pid=15771 is printed by init() as expected.

Now, repeat with a complex process that spawns children and threads:

$ LD_PRELOAD=`realpath a.out` python3
pid=15835
pid=15835
pid=15835
pid=15835
pid=15839
pid=15844
pid=15835
pid=15835
pid=15846
pid=15846
pid=15847
pid=15847
pid=15849
pid=15849
pid=15851
pid=15852
pid=15853
pid=15853
pid=15856
pid=15857
pid=15857
pid=15858
pid=15858
pid=15861
pid=15862
pid=15862
pid=15865
pid=15868
pid=15835
Python 3.8.2 (default, Apr 19 2020, 18:33:14) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

Observe that some entries, such as pid=15835, appear several times, which indicates that init() ran more than once in those processes.

Why?


Solution

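  • Your Python installation executes other programs while the interpreter starts up. When a process replaces its image using execve, the process ID does not change, but the dynamic loader runs again for the new image. LD_PRELOAD is inherited through the environment, so the library is loaded again and its constructors run once more under the same PID.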

    A simpler example looks like this:

    $ LD_PRELOAD=`realpath a.out` bash -c 'exec /bin/true'
    

    You can see more details by invoking strace:

    $ strace -f -E LD_PRELOAD=`realpath a.out` -eexecve bash -c 'exec /bin/true'
    execve("/usr/bin/bash", ["bash", "-c", "exec /bin/true"], 0x55b275d8b830 /* 29 vars */) = 0
    pid=801315
    execve("/bin/true", ["/bin/true"], 0x5564dedbeb80 /* 29 vars */) = 0
    pid=801315
    +++ exited with 0 +++
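
To tell from the library's own output which program image each line comes from, the constructor can also print the path of the running executable. The following is a minimal sketch, not part of the original answer: describe_process is a hypothetical helper name, and reading /proc/self/exe assumes Linux with /proc mounted.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: formats "pid=<pid> exe=<path>" into buf.
 * Returns the snprintf result, or -1 if the path could not be read. */
static int describe_process(char *buf, size_t len)
{
    char exe[4096];
    /* /proc/self/exe is a Linux-specific symlink to the running executable. */
    ssize_t n = readlink("/proc/self/exe", exe, sizeof exe - 1);
    if (n < 0)
        return -1;
    exe[n] = '\0'; /* readlink does not NUL-terminate */
    return snprintf(buf, len, "pid=%u exe=%s", (unsigned) getpid(), exe);
}

static void init(void) __attribute__((constructor));
static void init(void)
{
    char line[4200];
    /* Runs once per loaded image, so it fires again after every execve. */
    if (describe_process(line, sizeof line) > 0)
        fprintf(stderr, "%s\n", line);
}
```

Built and preloaded the same way as lib.c, this should make repeated PIDs show up with different exe= paths whenever the process has called execve in between.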