I managed to implement a Fisher–Yates shuffle function for python lists as an exercise for getting used to extending python. It works perfectly for relatively small lists, unless I run the function several times.
Whenever the list size goes over about 100, I get all kinds of memory problems:
>>>import evosutil
>>> a=[i for i in range(100)]
>>> evosutil.shuffle(a)
>>> a
[52, 66, 0, 58, 41, 18, 50, 37, 81, 43, 74, 49, 90, 20, 63, 32, 89, 60, 2, 44, 3, 80, 15, 24, 22, 69, 86, 31, 56, 68, 34, 13, 38, 26, 14, 91, 73, 79, 39, 65, 5, 75, 84, 55, 7, 53, 93, 42, 40, 9, 51, 82, 29, 30, 99, 64, 33, 97, 27, 11, 6, 67, 16, 94, 95, 62, 57, 17, 78, 77, 71, 98, 72, 8, 88, 36, 85, 59, 21, 96, 23, 46, 10, 12, 48, 83, 4, 92, 45, 54, 1, 25, 19, 70, 35, 61, 47, 28, 87, 76]
>>> (Ctrl-D)
*** Error in `python3': free(): invalid next size (fast): 0x083fe680 ***
Or, when trying to operate on a list with 1000 elements:
*** Error in `python3': munmap_chunk(): invalid pointer: 0x083ff0e0 ***
Or,
Segmentation fault (core dumped)
Here's my code for the module that produces the error:
inline void _List_SwapItems(PyObject* list, Py_ssize_t i1, Py_ssize_t i2){
PyObject* tmp=PyList_GetItem(list, i2);
PyList_SetItem(list, i2, PyList_GetItem(list, i1));
PyList_SetItem(list, i1, tmp);
}
//Naive Fisher–Yates shuffle
static PyObject* shuffle(PyObject* self, PyObject* args){
PyObject* list;
PyArg_ParseTuple(args,"O", &list);
unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
std::minstd_rand0 rand(seed);
Py_ssize_t size = PyList_Size(list);
for(int i=0; i<size;++i){
int randIndex = rand()%size;
_List_SwapItems(list, randIndex, i);
}
Py_RETURN_NONE;
}
I feel like I should be able to solve this either with free() or with Py_DECREF() somewhere, but I don't see where. I don't think I'm creating any objects, just moving them around. So where is the memory problem coming from?
You need to Py_XINCREF()
both objects before passing them to PyList_SetItem()
. Further, catch the special case where i1 == i2
:
inline void _List_SwapItems(PyObject* list, Py_ssize_t i1, Py_ssize_t i2){
if (i1 == i2) {
return;
}
PyObject* obj1=PyList_GetItem(list, i1);
PyObject* obj2=PyList_GetItem(list, i2);
Py_XINCREF(obj1);
Py_XINCREF(obj2);
PyList_SetItem(list, i2, obj1);
PyList_SetItem(list, i1, obj2);
}
PyList_GetItem()
returns a borrowed reference, i.e. it doesn't INCREF
the object it returns. If you don't hold any other references, the refcount will be 1
(as it is only referenced from the list). When you call PyList_SetItem(list, i2, ...)
, the list Py_XDECREF()
's the object previously stored at i2
(which you keep in tmp
). At that point, the refcount reaches 0
and the object is freed. Whoops.
Similarly, you can't just call PyList_SetItem(list, i, PyList_GetItem())
, because SetItem
steals the reference you pass to it. You don't own the reference, however, the 'old' list does. So you need an Py_XINCREF
here as well.
See the list API documentation for more details.
As a further suggestion, you might consider not programming directly against the Python extension API. It takes a lot of code to get anything done and it's subtly difficult to keep the refcounts correct. By now, there are multiple other ways to interface Python with C or C++. CFFI seems to be the low-level interface the Python ecosystem will standardize on. SIP and SWIG may offer better support for C++, though. For a SIP example, see this answer.