Search code examples
pythoncrefcounting

Why is my Python C Extension leaking memory?


The function below takes a python file handle, reads in packed binary data from the file, creates a Python dictionary and returns it. If I loop it endlessly, it'll continually consume RAM. What's wrong with my RefCounting?

static PyObject* __binParse_getDBHeader(PyObject *self, PyObject *args){

PyObject *o; //generic object
PyObject* pyDB = NULL; //this has to be a py file object

if (!PyArg_ParseTuple(args, "O", &pyDB)){
    return NULL;
} else {
    Py_INCREF(pyDB);
    if (!PyFile_Check(pyDB)){
        Py_DECREF(pyDB);
        PyErr_SetString(PyExc_IOError, "argument 1 must be open file handle");
        return NULL;
    }
}

FILE *fhDB = PyFile_AsFile(pyDB);

long offset = 0;
DB_HEADER *pdbHeader = malloc(sizeof(DB_HEADER));
fseek(fhDB,offset,SEEK_SET); //at the beginning
fread(pdbHeader, 1, sizeof(DB_HEADER), fhDB );
if (ferror(fhDB)){
    fclose(fhDB);
    Py_DECREF(pyDB);
    PyErr_SetString(PyExc_IOError, "failed reading database header");
    return NULL;
}
Py_DECREF(pyDB);

PyObject *pyDBHeader = PyDict_New();
Py_INCREF(pyDBHeader);

o=PyInt_FromLong(pdbHeader->version_number);
PyDict_SetItemString(pyDBHeader, "version", o);
Py_DECREF(o);

PyObject *pyTimeList = PyList_New(0);
Py_INCREF(pyTimeList);

int i;
for (i=0; i<NUM_DRAWERS; i++){
    //epochs
    o=PyInt_FromLong(pdbHeader->last_good_test[i]);
    PyList_Append(pyTimeList, o);
    Py_DECREF(o);
}
PyDict_SetItemString(pyDBHeader, "lastTest", pyTimeList);
Py_DECREF(pyTimeList);

o=PyInt_FromLong(pdbHeader->temp);
PyDict_SetItemString(pyDBHeader, "temp", o);
Py_DECREF(o);

free(pdbHeader);
return (pyDBHeader);
}

Solution

  • PyDict_New() returns a new reference, check the docs for PyDict. So if you increase the refcount immediately after creating it, you have two references to it. One is transferred to the caller when you return it as a result value, but the other one never goes aways.

    You also don't need to incref pyTimeList. It's yours when you create it. However, you need to decref it, but you only decref it once, so it's leaked as well.

    You also don't need to call Py_INCREF on pyDB. It's a borrowed reference and it won't go away as long as your function does not return, because it's still referenced in a lower stack frame.

    Only if you want to keep the reference in another structure somewhere, you need to increse the refcount.

    Cf. the API docs