I am trying to embed some RPython code into a Python script via ctypes. The RPython program is fairly simple:
# check.py
import os

from rpython.rlib.entrypoint import entrypoint_highlevel
from rpython.rtyper.lltypesystem import rffi

# Export hello() from the shared library under the C name "hello".
@entrypoint_highlevel(key='main', c_name='hello', argtypes=[rffi.LONGLONG])
def hello(value):
    os.write(1, "hello world")
    return 0

def main(args):
    return 0

def target(*args):
    return main, None
which compiles in a straightforward manner:
python /home/magniff/workspace/pypy3-v5.5.0-src/rpython/bin/rpython --shared check.py
yielding a shared object:
(venv) magniff@magniffy700:~/workspace/rfplib $ ls -la libcheck-c.so
-rwxrwxr-x 1 magniff magniff 320112 Jun 20 12:33 libcheck-c.so
So far so good, but when I try to run it with ctypes:
# script.py
import ctypes
l = ctypes.cdll.LoadLibrary("./libcheck-c.so")
l.hello(20)
it fails with a nasty segfault:
(venv) magniff@magniffy700:~/workspace/rfplib $ gdb --args python script.py
(gdb) r
Starting program: /home/magniff/Downloads/pypy3-v5.5.0-linux64/venv/bin/python script.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff646da6e in pypy_g_write () from ./libcheck-c.so
(gdb) bt
#0 0x00007ffff646da6e in pypy_g_write () from ./libcheck-c.so
#1 0x00007ffff64560ee in hello () from ./libcheck-c.so
#2 0x00007ffff6698e40 in ffi_call_unix64 () from /usr/lib/x86_64-linux-gnu/libffi.so.6
#3 0x00007ffff66988ab in ffi_call () from /usr/lib/x86_64-linux-gnu/libffi.so.6
#4 0x00007ffff68a83df in _ctypes_callproc () from /home/magniff/Downloads/pypy3-v5.5.0-linux64/venv/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so
#5 0x00007ffff68acd82 in ?? () from /home/magniff/Downloads/pypy3-v5.5.0-linux64/venv/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so
#6 0x00000000004b0cb3 in PyObject_Call ()
#7 0x00000000004c9faf in PyEval_EvalFrameEx ()
#8 0x00000000004c2765 in PyEval_EvalCodeEx ()
#9 0x00000000004c2509 in PyEval_EvalCode ()
#10 0x00000000004f1def in ?? ()
#11 0x00000000004ec652 in PyRun_FileExFlags ()
#12 0x00000000004eae31 in PyRun_SimpleFileExFlags ()
#13 0x000000000049e14a in Py_Main ()
#14 0x00007ffff7810830 in __libc_start_main (main=0x49dab0 <main>, argc=2, argv=0x7fffffffdda8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdd98) at ../csu/libc-start.c:291
#15 0x000000000049d9d9 in _start ()
I find this kind of weird, since os.write works just fine in standalone compilation mode.
GDB segfault details:
(gdb) p $_siginfo._sifields._sigfault.si_addr
$1 = (void *) 0x0
Hmm, okay, a null pointer dereference, I guess. Could it be that the garbage collector freed the string before os.write had a chance to actually print it?
And here is the disassembly of pypy_g_write:
0x00007ffff6695a20 <+0>: push %r14
0x00007ffff6695a22 <+2>: push %r13
0x00007ffff6695a24 <+4>: mov %rdi,%r14
0x00007ffff6695a27 <+7>: push %r12
0x00007ffff6695a29 <+9>: push %rbp
0x00007ffff6695a2a <+10>: lea 0x216acf(%rip),%rdi # 0x7ffff68ac500 <pypy_g_rpython_memory_gc_incminimark_IncrementalMiniMar>
0x00007ffff6695a31 <+17>: push %rbx
0x00007ffff6695a32 <+18>: mov %rsi,%rbx
0x00007ffff6695a35 <+21>: mov $0x4,%ebp
0x00007ffff6695a3a <+26>: sub $0x10,%rsp
0x00007ffff6695a3e <+30>: mov 0x10(%rsi),%r13
0x00007ffff6695a42 <+34>: callq 0x7ffff6683550 <pypy_g_IncrementalMiniMarkGC_can_move>
0x00007ffff6695a47 <+39>: test %al,%al
0x00007ffff6695a49 <+41>: jne 0x7ffff6695ae8 <pypy_g_write+200>
0x00007ffff6695a4f <+47>: lea 0x18(%rbx),%r12
0x00007ffff6695a53 <+51>: mov 0x216dae(%rip),%rax # 0x7ffff68ac808 <pypy_g_rpython_memory_gctypelayout_GCData+40>
0x00007ffff6695a5a <+58>: mov %r12,%rsi
0x00007ffff6695a5d <+61>: mov %r14,%rdi
0x00007ffff6695a60 <+64>: lea 0x8(%rax),%rdx
0x00007ffff6695a64 <+68>: mov %rdx,0x216d9d(%rip) # 0x7ffff68ac808 <pypy_g_rpython_memory_gctypelayout_GCData+40>
0x00007ffff6695a6b <+75>: mov %r13,%rdx
=> 0x00007ffff6695a6e <+78>: mov %rbx,(%rax) # segfault happens here
It seems that the problem lies in the default GC (minimark): for some reason it frees memory too early. With --gc=ref the program runs correctly.
Thanks to Armin Rigo, we have a solution: to run this code properly, you first have to initialize the RPython internals. Just call void rpython_startup_code(void) before calling the actual entry point, and that should be it.
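For illustration, here is a minimal sketch of how the fix might look on the ctypes side. The helper names init_rpython and call_hello are hypothetical and exist only for this example. The symbols rpython_startup_code and hello, and the LONGLONG argument, come from the post above. The return type of hello is left at the ctypes default (a C int), which is an assumption that only holds because hello returns 0.

```python
import ctypes


def init_rpython(lib):
    """Run the RPython startup code of an already loaded library.

    Call this before any other entry point of the library.
    """
    # void rpython_startup_code(void): no arguments, no return value.
    lib.rpython_startup_code.argtypes = []
    lib.rpython_startup_code.restype = None
    lib.rpython_startup_code()


def call_hello(lib, value):
    """Call the exported hello() entry point with a 64-bit integer."""
    # Matches argtypes=[rffi.LONGLONG] in check.py.
    # restype is left at the ctypes default (C int); this is an assumption.
    lib.hello.argtypes = [ctypes.c_longlong]
    return lib.hello(value)
```

With the library from the question, script.py would load it with lib = ctypes.CDLL("./libcheck-c.so"), then call init_rpython(lib) followed by call_hello(lib, 20).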