Tags: python, linker, apache-arrow, python-extensions

Undefined symbol at runtime when importing a Python C++ extension


I have a Python package (my_python_package), part of which is a C++ extension (my_ext) exposing a single function (my_ext_func). The extension depends on my C++ library (libmycpp), which in turn depends on libarrow. Importing the function from the extension fails with an error:

$ python3
Python 3.8.10 (default, Jun 22 2022, 20:18:18) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from my_python_package.my_ext import my_ext_func
Traceback (most recent call last):
...
ImportError: /usr/local/lib/libmycpp.so: undefined symbol: _ZNK5arrow6Status8ToStringB5cxx11Ev

libmycpp builds and links correctly, and I have some C++ executables that work fine with it.

I made sure that the extension is correctly linked to the arrow library:

ldd my_ext.so | grep arrow
libarrow.so.1200 => /lib/x86_64-linux-gnu/libarrow.so.1200 (0x00007f2f27f18000)

If I go to the directory under ~/.local/lib/python3.8/site-packages/... that contains my_ext.so, start the Python console there, and import the module directly, it works without any error:

$ python3
Python 3.8.10 (default, Jun 22 2022, 20:18:18) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from my_ext import my_ext_func
>>> 

The problem seems to lie in how the libraries are resolved at Python runtime, but I don't know how to fix it. Please help me with this.

UPD: I also found that if I go to the directory containing the my_ext module and run the import from there, it works:

$ cd ~/.local/lib/python3.8/site-packages/my_python_package/my_ext
$ python3
Python 3.8.10 (default, Jun 22 2022, 20:18:18) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from my_python_package.my_ext import my_ext_func
>>>

But if I import another module that imports my_ext.my_ext_func, I get an error.

UPD 2: When I set LD_PRELOAD=/lib/x86_64-linux-gnu/libarrow.so.1200 before starting python3, everything works. So I'm totally confused.


Solution

  • "So I'm totally confused."

    The problem here is that something changes when you change the working directory or use LD_PRELOAD, but Stack Overflow is not well suited to "live debugging" to figure out what that something is.

    In order to learn "how to fish", do this:

    Step 0. In your home directory, run this command:

    python3 -c "from my_python_package.my_ext import my_ext_func"
    

    This should reproduce the error.

    Step 1. Repeat with:

    env LD_DEBUG=symbols,bindings LD_DEBUG_OUTPUT=/tmp/my_ext \
      python3 -c "from my_python_package.my_ext import my_ext_func"
    

    This should reproduce the error again, and there should be a (large) /tmp/my_ext.$pid file. That file should mention the _ZNK5arrow6Status8ToStringB5cxx11Ev symbol.
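A quick way to check that the log mentions the symbol is to scan it from Python. This is only a minimal sketch: the helper name `find_symbol_lines` is made up for illustration, and it assumes the loader wrote its output to files matching `/tmp/my_ext.*`, as the command above requests.

```python
import glob


def find_symbol_lines(log_path, symbol):
    """Return (line_number, text) pairs for lines in log_path that mention symbol."""
    hits = []
    # Decode leniently: the log is plain text, but be tolerant of odd bytes.
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if symbol in line:
                hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    SYMBOL = "_ZNK5arrow6Status8ToStringB5cxx11Ev"
    # LD_DEBUG_OUTPUT=/tmp/my_ext makes the loader write /tmp/my_ext.<pid>.
    for path in sorted(glob.glob("/tmp/my_ext.*")):
        for number, text in find_symbol_lines(path, SYMBOL):
            print(f"{path}:{number}: {text}")
```

The lines it prints should show which files the loader searched for the symbol and, in a working run, which file it was finally bound to.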

    Step 2. Repeat the same command after changing directory to ~/.local/lib/python3.8/site-packages/my_python_package/my_ext.

    Look for differences between the two /tmp/my_ext.... files. In particular, look for different paths to shared libraries being used in the two scenarios.
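Since the logs are large, the path comparison can be partly automated. The sketch below assumes the usual glibc LD_DEBUG line shapes ("lookup in file=PATH [0]" and "binding file PATH [0] to PATH [0]") and paths without spaces. The function names are hypothetical, and the two log file names are passed on the command line.

```python
import re
import sys

# Assumed glibc LD_DEBUG line shapes:
#   symbol=NAME;  lookup in file=PATH [0]
#   binding file PATH [0] to PATH [0]: normal symbol `NAME'
_FILE_PATTERNS = (
    re.compile(r"lookup in file=(\S+) \[\d+\]"),
    re.compile(r"binding file (\S+) \[\d+\] to (\S+) \[\d+\]"),
)


def libraries_in_log(lines):
    """Collect every file path that appears in symbol lookups or bindings."""
    libs = set()
    for line in lines:
        for pattern in _FILE_PATTERNS:
            match = pattern.search(line)
            if match:
                libs.update(match.groups())
    return libs


def compare_logs(failing_lines, working_lines):
    """Return (paths only in the failing log, paths only in the working log)."""
    failing = libraries_in_log(failing_lines)
    working = libraries_in_log(working_lines)
    return sorted(failing - working), sorted(working - failing)


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], errors="replace") as bad, open(sys.argv[2], errors="replace") as good:
        only_failing, only_working = compare_logs(bad, good)
    print("only in failing run:", *only_failing, sep="\n  ")
    print("only in working run:", *only_working, sep="\n  ")
```

A path that shows up in only one of the two runs, for example the same library name under a different directory, is a good starting point for further investigation.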

    Step 3. Repeat again after changing back to your $HOME directory and adding LD_PRELOAD=...:

    env LD_DEBUG=symbols,bindings LD_DEBUG_OUTPUT=/tmp/my_ext \
      LD_PRELOAD=/lib/x86_64-linux-gnu/libarrow.so.1200 \
      python3 -c "from my_python_package.my_ext import my_ext_func"
    

    That is what I would have done in your situation, and it should give you enough fish for a few days ;-)

    If that is still not enough, you should probably ask a new (now much more detailed) question with relevant snippets from the debug files produced above.
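The three runs described above could also be scripted so that each scenario writes to its own log prefix, which makes the files easier to tell apart than by PID alone. This is a sketch under assumptions: the prefixes `/tmp/my_ext_home`, `/tmp/my_ext_extdir` and `/tmp/my_ext_preload` and the helper names are invented for illustration, while the paths come from the question.

```python
import os
import subprocess

IMPORT_CMD = ["python3", "-c", "from my_python_package.my_ext import my_ext_func"]
EXT_DIR = os.path.expanduser(
    "~/.local/lib/python3.8/site-packages/my_python_package/my_ext"
)
ARROW_LIB = "/lib/x86_64-linux-gnu/libarrow.so.1200"


def debug_env(output_prefix, preload=None, base_env=None):
    """Build an environment that enables loader debugging for a child process."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_DEBUG"] = "symbols,bindings"
    env["LD_DEBUG_OUTPUT"] = output_prefix  # loader appends ".<pid>"
    if preload is not None:
        env["LD_PRELOAD"] = preload
    return env


def run_scenario(output_prefix, cwd, preload=None):
    """Run the import in cwd and return the child's exit code (0 means it imported)."""
    result = subprocess.run(
        IMPORT_CMD,
        cwd=cwd,
        env=debug_env(output_prefix, preload),
        capture_output=True,
        text=True,
    )
    return result.returncode


def run_all():
    """Run the three scenarios from the steps above, one log prefix each."""
    home = os.path.expanduser("~")
    return {
        "home": run_scenario("/tmp/my_ext_home", home),
        "ext_dir": run_scenario("/tmp/my_ext_extdir", EXT_DIR),
        "preload": run_scenario("/tmp/my_ext_preload", home, preload=ARROW_LIB),
    }
```

Calling `run_all()` returns a dictionary of exit codes per scenario. The resulting log files can then be compared pairwise as in Step 2.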