Search code examples
pythonpython-3.xpython-c-api

Python C Extension Need to Py_INCREF a Borrowed Reference if not returning it to Python Land?


If you obtain a borrowed reference but do NOT return it to Python land, should you be doing a Py_INCREF when you obtain it and then a Py_DECREF once you are done with it, or is this superfluous?

For example, this trivial example gets a borrowed reference from PyDict_GetItem, does some stuff with the borrowed reference, and then returns None to Python land. Is there any need to Py_INCREF/Py_DECREF the borrowed reference while working with it here?

static PyObject PyFoo_bar(PyFoo *self, int field)
{
    int field = 0;

    PyFoo *child = NULL;

    if (!PyArg_ParseTuple(args, "i", &field) {
         return NULL;
    }

    child = PyDict_GetItem(self->children, field);

    // Long code omitted that works on child
    // Is it safe to do so without a Py_INCREF on child now followed by PY_DECREF later?

    // NOT returning child here - just return None
    Py_RETURN_NONE;
}

My worry here is that after calling PyDict_GetItem, the reference count of child can somehow drop to zero and then deallocate while I am working with it. Is that even possible? I don't think so, since the GIL isn't dropped here, but I am not sure.

On the other hand, I wonder if it's just best practice to do a Py_INCREF/Py_DECREF here. Maybe later on I drop the GIL, or later decide to return the child. Py_INCREF is cheap. For example, would it be better to do it this way:

child = PyDict_GetItem(self->children, field);
Py_INCREF(child);
// Long code omitted
Py_DECREF(child);
Py_RETURN_NONE;

Solution

  • It depends on what the intermediate "long code" does. If it executes Python code that you have no control over, then it's perfectly possible that that code accesses the self->children dict and deletes field, at which point yes, its refcount can drop to zero. So you'd want to protect against that case and add INCREF/DECREF.

    Note that any use of child will require reading the type object pointer, which sits adjacent to the reference count and the latter will thus have been loaded on the same cache line. With out-of-order execution, the INCREF/DECREF are basically free operations, so performance is no reason to leave them out.

    The best reason I can think of for not doing INCREF, is when the "long code" has multiple exit points (but does not execute arbitrary python code or touch self->children as explained above). You'd have to add DECREF for each exit, with a high risk of missing one and getting a hard-to-debug memory leak.