
How to use cppyy to embed Python instead of boost-python


I am currently using boost-python to embed a Python interpreter in my C++ application and to pass data from the executed Python code to the running C++ application through the boost-python bindings, as described at https://www.boost.org/doc/libs/1_75_0/libs/python/doc/html/tutorial/tutorial/embedding.html

I am running into performance problems, especially when calling wrapped functions with a large number of arguments: the overhead of parsing and boxing all those arguments to pass them to the C++ side is considerable.

I checked alternatives to boost-python, for example pybind11, which can also be embedded, but the performance is unlikely to improve. I also found out about cppyy, but from the documentation I am at a loss as to how to embed an interpreter into my program, or rather, how I should convert my current embedded-interpreter approach to use cppyy. My aim in trying out cppyy is to check whether using cppyy and/or PyPy as the interpreter can improve the performance of my code, since neither boost-python nor pybind11 supports embedding PyPy.

Can anyone give me pointers on how to replace an embedded Python interpreter that uses boost-python with cppyy?


Solution

  • The cppyy embedding interface isn't documented yet because it isn't functional on PyPy/cppyy (which seems to be what you're asking about most specifically), only on CPython. For the latter, I don't see how it would be faster than boost.python or pybind11, as it still relies on boxing variables and on the C-API to call into Python. C++ type lookups are potentially faster, but that would be all.

    You can easily get some performance numbers first by using cppyy to call into Python from C++ (Cling) and seeing how it looks. Here's a trivial example:

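    import cppyy
    import time
    
    N = 10000000
    
    # Python function that C++ will call through a function pointer
    def pycall(a):
        return a
    
    cppyy.cppdef("""\
    int (*ptr)(int) = 0;
    void func(uint64_t N) {
        for (uint64_t i = 0; i < N; ++i)
            ptr(1);    // each iteration calls back into Python
    }""")
    
    # assign the Python callable to the C++ function pointer
    cppyy.gbl.ptr = pycall
    
    # time N callbacks and report the average cost of one call
    ts = time.perf_counter()
    cppyy.gbl.func(N)
    print('time per call:', (time.perf_counter()-ts)/N)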
    

    To use your own code and types, rather than int, just include headers with cppyy.include and load libraries with cppyy.load_library(). Both Cling on the C++ side and cppyy on the Python side will then have full access, so you can use the types in the callback.
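
    A single timed run like the one above can be noisy, so when comparing numbers it may help to repeat the measurement and keep the best result. The helper below is only a sketch of that idea using the standard library; `time_per_call` and `pure_python_loop` are names made up for this illustration, and the pure-Python loop is merely a stand-in baseline, not part of cppyy.

```python
import time

def time_per_call(run, n_calls, repeats=5):
    """Return the best observed seconds-per-call over `repeats` runs.

    `run` is any callable that performs `n_calls` calls when invoked as
    run(n_calls), for example cppyy.gbl.func from the example above.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run(n_calls)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed / n_calls)
    return best

def pycall(a):
    return a

def pure_python_loop(n):
    # Hypothetical baseline: the same number of calls made from Python itself.
    for _ in range(n):
        pycall(1)

if __name__ == "__main__":
    print("baseline time per call:", time_per_call(pure_python_loop, 100_000))
```

    With cppyy set up as in the example, `time_per_call(cppyy.gbl.func, N)` would give the corresponding figure for C++ calling back into Python.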

    If the numbers look better, the main pieces you need are in CPyCppyy/API.h, see here: https://github.com/wlav/CPyCppyy/blob/master/include/CPyCppyy/API.h

    My best recommendation at this point, however, would be CFFI. It is C only, but it gives you callbacks that you can use directly and that are JIT friendly on PyPy: https://cffi.readthedocs.io/en/latest/embedding.html
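
    To give a feel for what the CFFI route involves, here is a minimal build-script sketch that follows the pattern in the CFFI embedding documentation. The module name `my_plugin`, the exported function `on_sample`, its signature, and the library name are all hypothetical and chosen only for illustration; your real API would declare whatever C functions your application needs to call.

```python
# Hypothetical names throughout: "my_plugin" and "on_sample" are illustrative only.

# C declarations of the functions the host application will call.
EXPORTED_API = """
    double on_sample(int id, double value);
"""

# Python code that runs inside the embedded interpreter when the library is
# first used; it supplies the implementation of each exported function.
INIT_CODE = """
from my_plugin import ffi

@ffi.def_extern()
def on_sample(id, value):
    return value * 2.0
"""

def build_plugin(target="libmy_plugin.*"):
    """Compile a shared library that embeds Python and exports EXPORTED_API."""
    import cffi  # imported here so the strings above can be inspected without cffi

    ffibuilder = cffi.FFI()
    ffibuilder.embedding_api(EXPORTED_API)
    ffibuilder.set_source("my_plugin", "")
    ffibuilder.embedding_init_code(INIT_CODE)
    # The ".*" suffix lets CFFI pick the platform's shared-library extension.
    return ffibuilder.compile(target=target, verbose=True)
```

    Calling `build_plugin()` needs CFFI and a C compiler. The C++ application would then link against the resulting library and call `on_sample` as an ordinary `extern "C"` function, with no boost-python or pybind11 layer in between.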