python, c, dlsym, ld-preload

can't intercept PyDict_New with LD_PRELOAD


I'm trying to use LD_PRELOAD to intercept the PyDict_New function. I've verified that this recipe works with getpid in the Python interpreter, and I've adapted it to use PyDict_New instead, but it doesn't behave as I expect. I'm clearly allocating dictionaries, so this function must be called, yet my override never runs.

What am I doing wrong?


Background: I'm trying to debug a problem in a very large system. I've found a dict with a bad reference count. I know where the dict is first allocated and where the problem manifests, but I'm fairly certain the count goes bad at some point in between. A simple code trace won't do, because the dict is cached and reused (via PyDict_New) by the gc system.


Solution

  • LD_PRELOAD can only override functions that the dynamic loader resolves at run time. If you're using the stock python binary, which in many builds has the interpreter linked in statically, PyDict_New is resolved inside the executable at link time, so the dynamic loader never gets a chance to redirect that symbol. If you instead create your own "python" by compiling a small binary and linking it against libpython.so, it should work. Here's what you'd need to put in your program (/tmp/foo.c):

    #include "Python.h"
    
    int
    main(int argc, char **argv)
    {
        return Py_Main(argc, argv);
    }
    

    You can build it with: gcc -o foo -I/usr/include/python2.7 foo.c -lpython2.7

    After you do this, LD_PRELOAD on ./foo should work.
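
For completeness, here is a minimal sketch of what the preloaded library itself might look like. It was not part of the original answer: the file name hook.c, the counter pydict_new_calls, and the log format are illustrative assumptions. It forward-declares PyObject as an opaque type so it builds without Python.h, looks up the real PyDict_New with dlsym(RTLD_NEXT, ...), and logs each returned pointer to stderr.

```c
/* hook.c: hypothetical interposer sketch for PyDict_New. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes RTLD_NEXT on glibc; must precede all includes */
#endif
#include <dlfcn.h>
#include <stdio.h>

/* Opaque forward declaration so this file builds without Python.h. */
typedef struct _object PyObject;

/* Illustrative counter, not part of any Python API. */
unsigned long pydict_new_calls = 0;

PyObject *PyDict_New(void)
{
    /* Cached pointer to the next PyDict_New in symbol lookup order. */
    static PyObject *(*real_PyDict_New)(void) = NULL;

    if (real_PyDict_New == NULL) {
        real_PyDict_New = (PyObject *(*)(void))dlsym(RTLD_NEXT, "PyDict_New");
    }

    pydict_new_calls++;

    if (real_PyDict_New == NULL) {
        /* No other definition is loaded, so there is nothing to forward to. */
        fprintf(stderr, "PyDict_New hook: real symbol not found\n");
        return NULL;
    }

    PyObject *dict = real_PyDict_New();
    fprintf(stderr, "PyDict_New -> %p\n", (void *)dict);
    return dict;
}
```

Assuming the file names above, this could be built and used roughly as follows: gcc -shared -fPIC -o hook.so hook.c -ldl, then LD_PRELOAD=./hook.so ./foo. One caveat: calls to PyDict_New made from inside libpython.so may still bypass the hook if the library was linked so that internal references bind locally (for example with -Bsymbolic-functions), in which case only calls coming from other modules would be intercepted.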