Tags: python, python-c-api

Python C extension segfault


I'm venturing into C extensions for the first time, and I'm somewhat new to C as well. I have a working C extension, but if I call the function repeatedly from Python, I eventually get "Segmentation fault: 11".

#include <Python.h>

static PyObject *getasof(PyObject *self, PyObject *args) {
    PyObject *fmap;
    long dt;

    if (!PyArg_ParseTuple(args, "Ol", &fmap, &dt))
        return NULL;

    long length = PyList_Size(fmap);

    for (int i = 0; i < length; i++) {
        PyObject *event = PyList_GetItem(fmap, i);
        long dti = PyInt_AsLong(PyList_GetItem(event, 0));
        if (dti > dt) {
            PyObject *output = PyList_GetItem(event, 1);
            return output;
        }
    }
    Py_RETURN_NONE;
}

The function's arguments are a time series (a list of lists), e.g. [[1, 'a'], [5, 'b']], and a time (a long), e.g. 4.

It's supposed to iterate over the list of lists until it finds an entry whose time is greater than the given time, then return that entry's value. As I mentioned, it returns the correct answer, but if I call it enough times, it segfaults.

My gut feeling is that this has to do with reference counting, but I'm not familiar enough with the concept to know if this is the direct cause.

Any help would be appreciated.


Solution

  • "My gut feeling is that this has to do with reference counting..." Your instincts are correct.

    PyList_GetItem returns a borrowed reference, which means your function doesn't "own" a reference to the item. So there is a problem here:

            PyObject *output = PyList_GetItem(event, 1);
            return output;
    

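    You don't own a reference to the item, but you return it to the caller as if you did. A function that returns a PyObject * is expected to hand the caller a new (owned) reference, so the caller will decrement the item's reference count when it is done with the result. Each call therefore leaves the count one lower than it should be, and once it reaches zero the item is deallocated even though the list still points to it; the next access to it can crash. That would explain why the segfault only shows up after repeated calls. So you'll need to increase the reference count of the item before you return it: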

            PyObject *output = PyList_GetItem(event, 1);
            Py_INCREF(output);
            return output;
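
To make the "eventually" part concrete, here is a toy model of the bookkeeping in plain C. It is only a sketch and does not use the CPython API: Obj, Box, incref, decref, get_borrowed, get_owned and item_survives are all hypothetical names made up for this illustration, and "deallocation" is just a flag. The caller is assumed to release the result after every call, which is what the interpreter does with a function's return value once nothing refers to it.

```c
#include <assert.h>

/* Toy stand-in for a reference-counted object; NOT CPython's PyObject. */
typedef struct {
    int refcount;
    int alive;   /* set to 0 when the count hits zero (stands in for free) */
    int value;
} Obj;

void incref(Obj *o) { o->refcount++; }

void decref(Obj *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0)
        o->alive = 0;
}

/* A container that holds a reference to one item. */
typedef struct { Obj *item; } Box;

/* Buggy getter: returns the container's item without taking a reference. */
Obj *get_borrowed(Box *b) { return b->item; }

/* Fixed getter: takes a new reference first, like Py_INCREF before return. */
Obj *get_owned(Box *b) { incref(b->item); return b->item; }

/* Simulates `calls` callers that each release the result when done.
   The item starts with `initial_refs` references held elsewhere
   (the container plus any others). Returns 1 if it is still alive. */
int item_survives(Obj *(*getter)(Box *), int initial_refs, int calls) {
    Obj item = { initial_refs, 1, 42 };
    Box box = { &item };
    for (int i = 0; i < calls && item.alive; i++) {
        Obj *result = getter(&box);
        decref(result);   /* caller drops the reference it believes it owns */
    }
    return item.alive;
}
```

With three initial references, item_survives(get_borrowed, 3, 2) still reports the item alive, while a third call kills it even though the box still points at it. With get_owned the count is back at its starting value after every call, no matter how many calls are made.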
    

    That assumes that PyList_GetItem(event, 1) doesn't fail! Except for PyArg_ParseTuple, you aren't checking the return values of the C API functions, which means you are assuming the input argument always has the exact structure that you expect. That's fine while you're testing code and figuring out how this works, but eventually you should be checking the return values of the C API functions for failure, and handling it appropriately.
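
As a rough sketch of what that checking could look like, here is the same function with each C API return value tested. This is one possible arrangement rather than the only correct one: the error messages are made up for illustration, and PyInt_AsLong is kept from the question (it exists in Python 2 only; on Python 3 the equivalent call is PyLong_AsLong).

```c
#include <Python.h>

static PyObject *getasof(PyObject *self, PyObject *args) {
    PyObject *fmap;
    long dt;

    if (!PyArg_ParseTuple(args, "Ol", &fmap, &dt))
        return NULL;

    /* The PyList_* calls below are only valid on list objects. */
    if (!PyList_Check(fmap)) {
        PyErr_SetString(PyExc_TypeError, "fmap must be a list");
        return NULL;
    }

    Py_ssize_t length = PyList_Size(fmap);

    for (Py_ssize_t i = 0; i < length; i++) {
        PyObject *event = PyList_GetItem(fmap, i);    /* borrowed */
        if (event == NULL)
            return NULL;
        if (!PyList_Check(event)) {
            PyErr_SetString(PyExc_TypeError, "each event must be a list");
            return NULL;
        }

        /* Returns NULL and sets IndexError if the event is too short. */
        PyObject *key = PyList_GetItem(event, 0);     /* borrowed */
        if (key == NULL)
            return NULL;

        /* -1 is a legitimate value, so also ask whether an error is set. */
        long dti = PyInt_AsLong(key);
        if (dti == -1 && PyErr_Occurred())
            return NULL;

        if (dti > dt) {
            PyObject *output = PyList_GetItem(event, 1);  /* borrowed */
            if (output == NULL)
                return NULL;
            Py_INCREF(output);   /* the caller receives its own reference */
            return output;
        }
    }
    Py_RETURN_NONE;
}
```

Returning NULL after a failed call passes the exception that the C API already set back to Python, so malformed input such as [[1]] or a non-list argument raises an exception instead of crashing the interpreter.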