I'm using the Python/C API to call Python functions from C++. My problem is that when I call a Python function that in turn imports scipy.optimize.least_squares, it hangs. Here are the details.
I'm calling my Python function testfunc1(foo, bar=True) in module test_clib.py as follows:
PyObject* pTestModuleName = PyUnicode_FromString( "test_clib" );
PyObject* pTestModule = PyImport_Import( pTestModuleName );
PyObject* pFunction = PyObject_GetAttrString( pTestModule, "testfunc1" );
const char* str = "foo";
PyObject* pArgList = Py_BuildValue( "(s)", str );
PyObject* pKeywords = PyDict_New();
PyDict_SetItemString( pKeywords, "bar", Py_True );
PyObject* pReturn = PyObject_Call( pFunction, pArgList, pKeywords );
When testfunc1() imports scipy.optimize.least_squares, it hangs. It doesn't even have to call least_squares; it hangs on this line:
from scipy.optimize import least_squares
But when I boil it down to a simple test program like the one shown here, it works. It fails only when the above snippet is part of my larger program.
So I realize this is not something anyone else can directly reproduce, but maybe someone can spot something that I'm missing.
Maybe this will be helpful: when I run the test program in gdb, it reports that about a dozen threads are started. My simple test program creates no threads itself, so all of those must come from the Python/C API. When I run gdb on my larger program, which makes the same Python calls, I don't see all those threads starting; it just hangs. When I interrupt it, it breaks here:
^C
Thread 18 "acamd" received signal SIGINT, Interrupt.
0x00007ffff73ca7e8 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) where
#0 0x00007ffff73ca7e8 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007ffff793bef5 in take_gil () from /lib64/libpython3.9.so.1.0
#2 0x00007ffff793c112 in PyEval_RestoreThread () from /lib64/libpython3.9.so.1.0
#3 0x00007ffff7a5dd8c in PyGILState_Ensure () from /lib64/libpython3.9.so.1.0
#4 0x00007fffb83bd6c5 in pybind11::detail::get_internals() () from /usr/local/lib64/python3.9/site-packages/scipy/spatial/_distance_pybind.cpython-39-x86_64-linux-gnu.so
#5 0x00007fffb83ae22c in PyInit__distance_pybind () from /usr/local/lib64/python3.9/site-packages/scipy/spatial/_distance_pybind.cpython-39-x86_64-linux-gnu.so
#6 0x00007ffff7a765bc in _imp_create_dynamic () from /lib64/libpython3.9.so.1.0
#7 0x00007ffff7989ac8 in cfunction_vectorcall_FASTCALL () from /lib64/libpython3.9.so.1.0
#8 0x00007ffff799a8eb in PyObject_Call () from /lib64/libpython3.9.so.1.0
#9 0x00007ffff7a089d8 in _PyEval_EvalFrameDefault () from /lib64/libpython3.9.so.1.0
#10 0x00007ffff79f084e in _PyEval_EvalCode () from /lib64/libpython3.9.so.1.0
#11 0x00007ffff79f1e2b in _PyFunction_Vectorcall () from /lib64/libpython3.9.so.1.0
: : :
which continues on for some 200 more lines.
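The gdb backtrace shows only the C frames. To see the Python-level stack at the moment of the stall, the import could be wrapped in a watchdog from the standard library's faulthandler module. What follows is a minimal sketch, not a definitive fix: the helper name import_with_watchdog and the log path are hypothetical, and it assumes the process can write to that path. The watchdog runs in its own thread inside faulthandler, so it should be able to write the dump even while the importing thread is blocked.

```python
import faulthandler
import importlib


def import_with_watchdog(module_name, log_path, timeout_s=10.0):
    """Import module_name and return it.

    If the import has not finished after timeout_s seconds, the tracebacks
    of all Python threads are written to log_path. The process is not
    terminated (exit=False), so gdb can still be attached afterwards.
    """
    # The file must stay open while the watchdog is armed, because
    # faulthandler writes to its file descriptor directly.
    with open(log_path, "w") as log_file:
        faulthandler.dump_traceback_later(timeout_s, file=log_file, exit=False)
        try:
            return importlib.import_module(module_name)
        finally:
            # Disarm the watchdog once the import has returned or raised.
            faulthandler.cancel_dump_traceback_later()
```

Inside testfunc1 this might be used as `least_squares = import_with_watchdog("scipy.optimize", "/tmp/import_hang.log").least_squares`, where the path is only a placeholder. If the import completes normally, the log file stays empty.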
The traceback (specifically PyEval_RestoreThread) indicates that the thread is stuck trying to reclaim the GIL (global interpreter lock).
Several things can lead up to this point.
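The gdb output shows the hang on Thread 18 ("acamd"), so one thing that may be worth checking is which thread is actually entering Python in the larger program. The sketch below uses only the standard threading module; describe_calling_thread is a hypothetical helper name, and the larger program's threading setup is not shown in the question, so this is a diagnostic aid under that assumption rather than a diagnosis. Note that threading.main_thread() reflects the thread that first imported threading, which is normally, but not necessarily, the one that initialized the interpreter.

```python
import threading


def describe_calling_thread():
    """Return a small report about the thread that is executing this call."""
    current = threading.current_thread()
    # get_native_id() exists on Python 3.8+ on most platforms; fall back to
    # None where it is missing instead of failing.
    get_native_id = getattr(threading, "get_native_id", None)
    return {
        "name": current.name,
        "ident": threading.get_ident(),
        "native_id": get_native_id() if get_native_id else None,
        # False when the caller is not the thread Python regards as main.
        "is_main": current is threading.main_thread(),
        # Names of all threads the threading module currently knows about.
        "known_threads": [t.name for t in threading.enumerate()],
    }
```

Printing `describe_calling_thread()` as the first statement of testfunc1, before the scipy import, in both the simple test program and the larger one would show whether the two enter Python from the same kind of thread. If is_main comes back False in the larger program, the calling C++ thread generally has to acquire the GIL with PyGILState_Ensure() before calling into Python and release it with PyGILState_Release() afterwards.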