I am trying to get some Cython bindings to work where the external C code uses parameters of type char**, as usually seen for main methods.
Unfortunately, all of my previous attempts failed and I could not find any resources on how this actually can be achieved. The existing solutions I was able to find usually refer to arrays of numbers or require rewriting the original code.
How can a method using char** parameters be called, preferably without having to modify the call semantics of the underlying C code I am interfacing with?
# File setup.py
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("my_test.pyx", language_level=3)
)
# File my_test.pyx
from binding cimport add as _add, main as _main

def add(a, b):
    return _add(a, b)

def main(argc, argv):
    cdef char[:, ::1] argv_array = [b'fixed', b'values'] + [x.encode() for x in argv]
    return _main(argc + 2, &argv_array[0][0])
# File binding.pxd
cdef extern from "module1.c":
    int add(int a, int b)
    int main(int argc, char** argv)
// File module1.c
#include <stdio.h>

static int add(int a, int b) {
    return a + b;
}

int main(int argc, char** argv) {
    printf("Result: %d\n", add(40, 2));
    for (int i = 0; i < argc; i++) {
        printf("%s\n", argv[i]);
    }
    return 0;
}
(venv) user@host ~/path/to/directory $ python setup.py build_ext --inplace
Compiling my_test.pyx because it changed.
[1/1] Cythonizing my_test.pyx
Error compiling Cython file:
------------------------------------------------------------
...
    return _add(a, b)
def main(argc, argv):
    cdef char[:, ::1] argv_array = [x.encode() for x in argv]
    return _main(argc, &argv_array[0][0])
                       ^
------------------------------------------------------------
my_test.pyx:12:23: Cannot assign type 'char *' to 'char **'
Traceback (most recent call last):
  File "setup.py", line 5, in <module>
    ext_modules = cythonize("my_test.pyx", language_level=3)
  File "/home/user/path/to/directory/venv/lib/python3.8/site-packages/Cython/Build/Dependencies.py", line 1127, in cythonize
    cythonize_one(*args)
  File "/home/user/path/to/directory/venv/lib/python3.8/site-packages/Cython/Build/Dependencies.py", line 1250, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: my_test.pyx
Declaring ctypedef char* cchar_tp and using it as cdef cchar_tp[:, ::1] argv_array will yield another error message:
Invalid base type for memoryview slice: cchar_tp
The problem you're facing is that a 2D memoryview/array is not a pointer to pointers (because that's generally an awful way of storing an array). Instead it's a single 1D block of memory plus some sizes defining the length of each dimension. Note that char** (representing a "list" of strings) isn't quite the same as a 2D array, since the strings generally have different lengths.
Therefore you must create a separate array of pointers, each of which can point into your larger array. This is discussed in this question, which I originally marked as a duplicate, and still think is probably a duplicate. The approach there should still work.
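To make the layout difference concrete, here is a minimal C sketch. The names fill_row_pointers and joined_length are made up for illustration and are not part of your code. It assumes a flat, fixed-width, row-major char buffer with NUL-terminated rows (roughly what a char[:, ::1] memoryview gives you) and builds the separate pointer table that a char** parameter expects.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: make out[i] point at the start of row i of a flat,
 * row-major buffer holding n_rows rows of row_len bytes each.
 * Nothing is copied; the pointers borrow from the flat buffer. */
void fill_row_pointers(char *flat, size_t n_rows, size_t row_len, char **out) {
    for (size_t i = 0; i < n_rows; i++) {
        out[i] = flat + i * row_len;
    }
}

/* Hypothetical consumer with a main-like signature: sums the string lengths.
 * Assumes every argv[i] is NUL-terminated. */
size_t joined_length(int argc, char **argv) {
    size_t total = 0;
    for (int i = 0; i < argc; i++) {
        total += strlen(argv[i]);
    }
    return total;
}
```

For instance, given char flat[2][8] = {"fixed", "values"} and char *rows[2], calling fill_row_pointers(&flat[0][0], 2, 8, rows) produces something you can pass where a char** is required. Passing &flat[0][0] directly is exactly the 'char *' versus 'char **' mismatch from your compiler error.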
You can take one shortcut with Python bytes objects: they can be assigned directly to a const char*. The pointer will just point into the Python-owned memory, so the bytes object must outlive the C pointer. In this case I ensure that by keeping the bytes objects in a list.
from libc.stdlib cimport malloc, free

cdef extern from *:
    """
    int m(int n, const char** argv) {
        return 1;
    }
    """
    int m(int n, const char**)

def call_m():
    cdef const char** to_pass
    # the list keeps the bytes objects alive while the C pointers are in use
    args = [b"arg1", b"arg2"]
    to_pass = <const char**>malloc(sizeof(const char*)*len(args))
    if to_pass == NULL:
        raise MemoryError()
    try:
        for n, a in enumerate(args):
            to_pass[n] = a # use auto-conversion from Python bytes to char*
        m(len(args), to_pass)
    finally:
        free(to_pass)
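One detail that may matter if the C function really is a main: the C standard guarantees that argv[argc] is a null pointer, and some argument-parsing code relies on that. The C sketch below mirrors call_m on the C side, using a hypothetical helper make_argv (my own name, not an existing API) that allocates one extra slot for the terminator. In Cython the equivalent would presumably be to malloc len(args) + 1 pointers and set the last one to NULL.

```c
#include <stdlib.h>

/* Hypothetical helper: build a NULL-terminated, argv-style table of n
 * borrowed pointers. The strings are not copied, so they must outlive the
 * table. Returns NULL on allocation failure; release the table with free(). */
const char **make_argv(const char *const *src, int n) {
    const char **argv = malloc(((size_t)n + 1) * sizeof *argv);
    if (argv == NULL) {
        return NULL;
    }
    for (int i = 0; i < n; i++) {
        argv[i] = src[i];
    }
    argv[n] = NULL; /* main-style code may expect argv[argc] == NULL */
    return argv;
}
```

As with the Cython version, only the pointer table is owned by the caller here; the strings themselves stay owned by whoever created them.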