Tags: python, ctypes, circular-dependency, dlopen, python-cffi

Resolving circular shared-object dependencies with ctypes/cffi


I would like to use cffi (or even ctypes if I must) to access a C ABI from Python 3 on Linux. The API is implemented by a number of .so files (let's call them libA.so, libB.so and libC.so), such that libA contains the main exported functions, and the other libs provide support for libA.

Now, libA depends on libB and libB depends on libC. However, there's a problem. There's a global array defined by libA that libC expects to be present. So libC actually depends on libA: a circular dependency. Trying to use cffi or the ctypes equivalent of dlopen to load libA results in missing symbols from libB and libC, but trying to load libC first results in an error about the missing array (which is in libA).

Since it's a variable, rather than a function, the RTLD_LAZY option doesn't seem to apply here.

Oddly, ldd libA.so doesn't show libB or libC as dependencies, so I'm not sure if that's part of the problem. I suppose the libraries rely on any program that links against them to specify all of them explicitly.

Is there a way to get around this? One idea was to create a new shared object (say, "all.so") that is dependent on libA, libB and libC so that dlopen("all.so") might load everything it needs in one go, but I can't get this to work either.

What's the best strategy to handle this situation? In reality, the ABI I'm trying to access is pretty large, with perhaps 20-30 shared object files.


Solution

  • This (if I understood the problem correctly) is a perfectly normal use case on Nix (Unix-like systems), and should run without problems.

    When dealing with problems related to ctypes ([Python 3]: ctypes - A foreign function library for Python), the best (generic) way to tackle them is:

    • Write a (small) C application that does the required job (and of course, works)
    • Only then move to ctypes (basically this is translating the above application)

    I prepared a small (and dummy) example:

    • defines.h:

      #pragma once
      
      #include <stdio.h>
      
      #define PRINT_MSG_0() printf("From C: [%s] (%d) - [%s]\n", __FILE__, __LINE__, __FUNCTION__)
      
    • libC:

      • libC.h:

        #pragma once
        
        
        size_t funcC();
        
      • libC.c:

        #include "defines.h"
        #include "libC.h"
        #include "libA.h"
        
        
        size_t funcC() {
            PRINT_MSG_0();
            for (size_t i = 0; i < ARRAY_DIM; i++)
            {
                printf("%zu - %c\n", i, charArray[i]);
            }
            printf("\n");
            return ARRAY_DIM;
        }
        
    • libB:

      • libB.h:

        #pragma once
        
        
        size_t funcB();
        
      • libB.c:

        #include "defines.h"
        #include "libB.h"
        #include "libC.h"
        
        
        size_t funcB() {
            PRINT_MSG_0();
            return funcC();
        }
        
    • libA:

      • libA.h:

        #pragma once
        
        #define ARRAY_DIM 3
        
        
        extern char charArray[ARRAY_DIM];
        
        size_t funcA();
        
      • libA.c:

        #include "defines.h"
        #include "libA.h"
        #include "libB.h"
        
        
        char charArray[ARRAY_DIM] = {'A', 'B', 'C'};
        
        
        size_t funcA() {
            PRINT_MSG_0();
            return funcB();
        }
        
    • code.py:

      #!/usr/bin/env python3
      
      import sys
      from ctypes import CDLL, \
          c_size_t
      
      
      DLL = "./libA.so"
      
      
      def main():
          lib_a = CDLL(DLL)
          func_a = lib_a.funcA
          func_a.restype = c_size_t
      
          ret = func_a()
          print("{:s} returned {:d}".format(func_a.__name__, ret))
      
      
      if __name__ == "__main__":
          print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
          main()
      

    Output:

    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ls
    code.py  defines.h  libA.c  libA.h  libB.c  libB.h  libC.c  libC.h
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libC.so libC.c
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libB.so libB.c -L. -lC
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libA.so libA.c -L. -lB
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ls
    code.py  defines.h  libA.c  libA.h  libA.so  libB.c  libB.h  libB.so  libC.c  libC.h  libC.so
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. ldd libC.so
            linux-vdso.so.1 =>  (0x00007ffdfb1f4000)
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f56dcf23000)
            /lib64/ld-linux-x86-64.so.2 (0x00007f56dd4ef000)
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. ldd libB.so
            linux-vdso.so.1 =>  (0x00007ffc2e7fd000)
            libC.so => ./libC.so (0x00007fdc90a9a000)
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdc906d0000)
            /lib64/ld-linux-x86-64.so.2 (0x00007fdc90e9e000)
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. ldd libA.so
            linux-vdso.so.1 =>  (0x00007ffd20d53000)
            libB.so => ./libB.so (0x00007fdbee95a000)
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdbee590000)
            libC.so => ./libC.so (0x00007fdbee38e000)
            /lib64/ld-linux-x86-64.so.2 (0x00007fdbeed5e000)
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> nm -S libC.so | grep charArray
                     U charArray
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> nm -S libA.so | grep charArray
    0000000000201030 0000000000000003 D charArray
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. python3 code.py
    Python 3.5.2 (default, Nov 12 2018, 13:43:14)
    [GCC 5.4.0 20160609] on linux
    
    From C: [libA.c] (9) - [funcA]
    From C: [libB.c] (7) - [funcB]
    From C: [libC.c] (7) - [funcC]
    0 - A
    1 - B
    2 - C
    
    funcA returned 3
    

    But if your array is declared as static ([CPPReference]: C keywords: static), and therefore can't be extern as in the example, then you're out of luck: a static array has internal linkage, so it isn't exported from libA and libC can't reference it.
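
A quick way to check this from Python, as a complement to the nm commands above, is to ask ctypes to resolve the symbol through an already loaded library. The sketch below is only illustrative: exports_symbol is a hypothetical helper name, and it relies on in_dll() raising ValueError when the lookup fails. Because the lookup can also succeed through a library's dependencies, a True result doesn't prove which .so defines the symbol, but a False result for charArray on libA would suggest the array is static (or otherwise hidden).

```python
import ctypes


def exports_symbol(lib, name):
    """Return True if `name` can be resolved through the loaded library `lib`.

    Hypothetical helper (not part of ctypes): it only checks that the
    symbol can be looked up, it never reads the data behind it.
    """
    try:
        # in_dll() looks the symbol up in the library and raises
        # ValueError when the symbol cannot be found.
        ctypes.c_char.in_dll(lib, name)
    except ValueError:
        return False
    return True


# Assumed usage with the example library from this answer:
#   lib_a = ctypes.CDLL("./libA.so")
#   exports_symbol(lib_a, "charArray")
```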

    @EDIT0: Extending the example so that it better fits the description.

    Since ldd doesn't show dependencies between the .sos, I'm going to assume that each is loaded dynamically.

    • utils.h:

      #pragma once
      
      #include <dlfcn.h>
      
      
      void *loadLib(char id);
      
    • utils.c:

      #include "defines.h"
      #include "utils.h"
      
      
      void *loadLib(char id) {
          PRINT_MSG_0();
          char libNameFormat[] = "lib%c.so";
          char libName[8];
          sprintf(libName, libNameFormat, id);
          // Both flags are required: with RTLD_NOW instead of RTLD_LAZY, or without RTLD_GLOBAL, loading fails.
          int load_flags = RTLD_LAZY | RTLD_GLOBAL;
          void *ret = dlopen(libName, load_flags);
          if (ret == NULL) {
              char *err = dlerror();
              printf("Error loading lib (%s): %s\n", libName, (err != NULL) ? err : "(null)");
          }
          return ret;
      }
      

    Below is a modified version of libB.c. Note that the same pattern should also be applied to the original libA.c.

    • libB.c:

      #include "defines.h"
      #include "libB.h"
      #include "libC.h"
      #include "utils.h"
      
      
      size_t funcB() {
          PRINT_MSG_0();
          void *mod = loadLib('C');
          size_t ret = funcC();
          dlclose(mod);
          return ret;
      }
      

    Output:

    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ls
    code.py  defines.h  libA.c  libA.h  libB.c  libB.h  libC.c  libC.h  utils.c  utils.h
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libC.so libC.c utils.c
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libB.so libB.c utils.c
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> gcc -fPIC -shared -o libA.so libA.c utils.c
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ls
    code.py  defines.h  libA.c  libA.h  libA.so  libB.c  libB.h  libB.so  libC.c  libC.h  libC.so  utils.c  utils.h
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ldd libA.so
            linux-vdso.so.1 =>  (0x00007ffe5748c000)
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4d9e3f6000)
            /lib64/ld-linux-x86-64.so.2 (0x00007f4d9e9c2000)
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ldd libB.so
            linux-vdso.so.1 =>  (0x00007ffe22fe3000)
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe93ce8a000)
            /lib64/ld-linux-x86-64.so.2 (0x00007fe93d456000)
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> ldd libC.so
            linux-vdso.so.1 =>  (0x00007fffe85c3000)
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2d47453000)
            /lib64/ld-linux-x86-64.so.2 (0x00007f2d47a1f000)
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> nm -S libC.so | grep charArray
                     U charArray
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> nm -S libA.so | grep charArray
    0000000000201060 0000000000000003 D charArray
    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. python3 code.py
    Python 3.5.2 (default, Nov 12 2018, 13:43:14)
    [GCC 5.4.0 20160609] on linux
    
    Traceback (most recent call last):
      File "code.py", line 22, in <module>
        main()
      File "code.py", line 12, in main
        lib_a = CDLL(DLL)
      File "/usr/lib/python3.5/ctypes/__init__.py", line 347, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: ./libA.so: undefined symbol: funcB
    

    I believe that this reproduces the problem. Now, if you modify code.py so that it passes RTLD_LAZY | RTLD_GLOBAL to CDLL:

    #!/usr/bin/env python3
    
    import sys
    from ctypes import CDLL, \
        RTLD_GLOBAL, \
        c_size_t
    
    
    RTLD_LAZY = 0x0001
    
    DLL = "./libA.so"
    
    
    def main():
        lib_a = CDLL(DLL, RTLD_LAZY | RTLD_GLOBAL)
        func_a = lib_a.funcA
        func_a.restype = c_size_t
    
        ret = func_a()
        print("{:s} returned {:d}".format(func_a.__name__, ret))
    
    
    if __name__ == "__main__":
        print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
        main()
    

    you'd get the following output:

    [cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q053327620]> LD_LIBRARY_PATH=. python3 code.py
    Python 3.5.2 (default, Nov 12 2018, 13:43:14)
    [GCC 5.4.0 20160609] on linux
    
    From C: [libA.c] (11) - [funcA]
    From C: [utils.c] (6) - [loadLib]
    From C: [libB.c] (8) - [funcB]
    From C: [utils.c] (6) - [loadLib]
    From C: [libC.c] (7) - [funcC]
    0 - A
    1 - B
    2 - C
    
    funcA returned 3
    

    Notes:

    • It's essential that the C code passes RTLD_LAZY | RTLD_GLOBAL to dlopen. If RTLD_LAZY is replaced by RTLD_NOW, it won't work, because every function symbol would then have to be resolved at load time, before the library that defines it has been loaded
      • Also, if RTLD_GLOBAL isn't specified, it won't work either, since the library's symbols wouldn't be made available to the libraries loaded after it. I didn't check whether there are other RTLD_ flags that could be specified instead of RTLD_GLOBAL for things to still work
    • Creating a wrapper library that handles loading and initializing all the libraries would be a good workaround, especially if you plan to use them from multiple places (that way, the whole process would happen in one place only). The previous bullet would still apply, though
    • ctypes itself only exposes RTLD_GLOBAL and RTLD_LOCAL, not RTLD_LAZY (or many other related flags). Defining it in code.py is a workaround, and its value might differ across Nix platforms (flavors). Since Python 3.3, the os module provides os.RTLD_LAZY (and related constants) on platforms that define them, which avoids hardcoding the value
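
The last two notes can be combined into a small Python-side loader. This is a minimal sketch under stated assumptions, not a tested recipe for the real libraries: preload is a hypothetical helper name, the flag values come from the os module instead of a hardcoded 0x0001, and the loader parameter exists only so the function can be exercised without real .so files. It assumes that loading the library that defines the shared variable first (libA in the example) lets the later ones find it.

```python
import ctypes
import os

# RTLD_LAZY defers function binding until the first call;
# RTLD_GLOBAL makes each library's symbols visible to later loads.
LOAD_MODE = os.RTLD_LAZY | os.RTLD_GLOBAL


def preload(paths, loader=ctypes.CDLL):
    """Load every library in `paths`, in order, and return {path: handle}.

    Hypothetical helper: `loader` defaults to ctypes.CDLL and is
    replaceable for testing.
    """
    handles = {}
    for path in paths:
        handles[path] = loader(path, mode=LOAD_MODE)
    return handles


# Assumed usage with the example libraries from this answer:
#   libs = preload(["./libA.so", "./libB.so", "./libC.so"])
#   func_a = libs["./libA.so"].funcA
#   func_a.restype = ctypes.c_size_t
```

With 20-30 shared objects, keeping the load order in one list like this means the whole loading sequence lives in a single place, which is the same motivation as the wrapper library mentioned above.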