Calling methods using char** with Cython


I am trying to get some Cython bindings to work where the external C code takes parameters of type char**, as typically seen in the signature of main.

Unfortunately, all of my attempts so far have failed, and I could not find any resources on how this can actually be achieved. The existing solutions I was able to find usually deal with arrays of numbers or require rewriting the original code.

How can a function taking char** parameters be called, preferably without having to modify the call semantics of the underlying C code I am interfacing with?


Example

# File setup.py
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("my_test.pyx", language_level=3)
)

# File my_test.pyx
from binding cimport add as _add, main as _main


def add(a, b):
    return _add(a, b)


def main(argc, argv):
    cdef char[:, ::1] argv_array = [b'fixed', b'values'] + [x.encode() for x in argv]
    return _main(argc + 2, &argv_array[0][0])

# File binding.pxd
cdef extern from "module1.c":
    int add(int a, int b)
    int main(int argc, char** argv)

// File module1.c
#include <stdio.h>

static int add(int a, int b) {
    return a + b;
}


int main(int argc, char** argv) {
    printf("Result: %d\n", add(40, 2));
    for (int i = 0; i < argc; i++) {
        printf("%s\n", argv[i]);
    }
    return 0;
}

Error message

(venv) user@host ~/path/to/directory $ python setup.py build_ext --inplace
Compiling my_test.pyx because it changed.
[1/1] Cythonizing my_test.pyx

Error compiling Cython file:
------------------------------------------------------------
...
    return _add(a, b)


def main(argc, argv):
    cdef char[:, ::1] argv_array = [x.encode() for x in argv]
    return _main(argc, &argv_array[0][0])
                      ^
------------------------------------------------------------

my_test.pyx:12:23: Cannot assign type 'char *' to 'char **'
Traceback (most recent call last):
  File "setup.py", line 5, in <module>
    ext_modules = cythonize("my_test.pyx", language_level=3)
  File "/home/user/path/to/directory/venv/lib/python3.8/site-packages/Cython/Build/Dependencies.py", line 1127, in cythonize
    cythonize_one(*args)
  File "/home/user/path/to/directory/venv/lib/python3.8/site-packages/Cython/Build/Dependencies.py", line 1250, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: my_test.pyx

Declaring ctypedef char* cchar_tp and using it as cdef cchar_tp[:, ::1] argv_array yields a different error message:

Invalid base type for memoryview slice: cchar_tp


Solution

  • The problem you're facing is that a 2D memoryview/array is not a pointer to pointers (because that's generally an awful way of storing an array). Instead, it's a single contiguous 1D buffer plus some sizes defining the length of each dimension. Note that char** (representing a "list" of strings) isn't quite the same as a 2D array anyway, since the strings generally have different lengths.

    Therefore, you must create a separate array of pointers, each of which points into your larger array. This is discussed in this question, which I originally marked as a duplicate (and still think probably is one). The approach there should still work.

    You can take one shortcut with Python bytes objects: they can be assigned directly to a const char*. The pointer just points into Python-owned memory, so the bytes object must outlive the C pointer. In this case I ensure that by keeping them in a list.

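    from libc.stdlib cimport malloc, free

    cdef extern from *:
        """
        /* Dummy function standing in for the real C API. The parameter is
           named because C before C23 requires names in function definitions. */
        int m(int n, const char** args) {
            return 1;
        }
        """
        int m(int n, const char** args)

    def call_m():
        cdef const char** to_pass
        # The list keeps the bytes objects alive while C holds pointers to them.
        args = [b"arg1", b"arg2"]
        to_pass = <const char**>malloc(sizeof(const char*) * len(args))
        if not to_pass:
            raise MemoryError()  # malloc returned NULL
        try:
            for n, a in enumerate(args):
                to_pass[n] = a  # use auto-conversion from Python bytes to char*
            m(len(args), to_pass)
        finally:
            free(to_pass)  # frees only the pointer array, not the strings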
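
The layout difference described in this answer can be sketched in plain C. This is a minimal illustration and not part of the original answer: the names total_length, make_row_pointers, and demo are hypothetical, and total_length merely stands in for any C function that takes (int argc, char **argv). The sketch shows that a 2D char array is one contiguous block holding no pointers at all, so a separate array of row pointers has to be built before the data can be passed as char**.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ROWS 3
#define COLS 8

/* Hypothetical stand-in for a C API that expects argv-style input. */
size_t total_length(int argc, char **argv) {
    size_t total = 0;
    for (int i = 0; i < argc; i++) {
        total += strlen(argv[i]);
    }
    return total;
}

/* Build an array of pointers, one per row of a contiguous 2D buffer.
 * The caller must free() the returned array; the buffer itself must
 * outlive the pointers. Returns NULL if the allocation fails. */
char **make_row_pointers(char (*buffer)[COLS], int rows) {
    char **ptrs = malloc(sizeof(char *) * (size_t)rows);
    if (ptrs == NULL) {
        return NULL;
    }
    for (int i = 0; i < rows; i++) {
        ptrs[i] = buffer[i];  /* address of the first char of row i */
    }
    return ptrs;
}

/* Returns the combined string length, or (size_t)-1 on allocation failure. */
size_t demo(void) {
    char buffer[ROWS][COLS] = {"fixed", "values", "extra"};

    /* The 2D array is exactly ROWS * COLS bytes: it stores characters only,
     * so there is nothing in it that could serve as a char**. */
    assert(sizeof buffer == (size_t)ROWS * COLS);

    char **argv_like = make_row_pointers(buffer, ROWS);
    if (argv_like == NULL) {
        return (size_t)-1;
    }
    size_t total = total_length(ROWS, argv_like);
    free(argv_like);  /* frees the pointer array only, not the buffer */
    return total;
}
```

The same pattern carries over to the Cython side: allocate the pointer array with malloc, point each entry at a row (or at a bytes object that is kept alive), call the C function, and free the pointer array afterwards.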