Tags: python, python-3.x, python-c-api

Does PyDict_SetItem increase the reference count of the key, and if so, where in the code does it take place?


TLDR: PyDict_SetItem increments the reference counts of both the key and the value, but where in the code does this happen?

PyDict_SetItem makes a call to insertdict.

insertdict immediately performs Py_INCREF on both the key and the value. However, at the end of the success path, it then does a Py_DECREF on the key (but not the value). There must be some part of this code that I am missing where it does an extra Py_INCREF on the key before it does this Py_DECREF. My question is: where and why does this extra Py_INCREF take place? Why is the initial Py_INCREF at the start of insertdict insufficient?

At first glance, this suggests that PyDict_SetItem only increases the reference count of the value, not the key. That cannot be right. For example, PyDict_SetItemString takes a char *, converts it into a PyObject via PyUnicode_FromString (which returns a new reference), and calls Py_DECREF on that new object after calling PyDict_SetItem. If PyDict_SetItem did not incref the key, that Py_DECREF would free a key the dict still points to, and the program would eventually segfault. Since that does not happen, I must be missing something here.


Finally, this code demonstrates that PyDict_SetItem increments the reference counts of both the key and the value, and that the caller should decref both afterwards unless they are borrowed references or ownership is being handed to someone else.

#include <Python.h>
#include <stdio.h>

int main(void)
{
    Py_Initialize();
    PyObject *dict = NULL, *key = NULL, *value = NULL;
    int i = 5000;
    char *o = "foo";

    if (!(dict = PyDict_New())) {
        goto error;
    }
    if (!(key = Py_BuildValue("i", i))) {
        goto error;
    }
    if (!(value = Py_BuildValue("s", o))) {
        goto error;
    }
    /* Py_REFCNT returns a Py_ssize_t, so print it with %zd rather than %i. */
    printf("Before PyDict_SetItem\n");
    printf("key is %zd\n", Py_REFCNT(key));  /* Prints 1 */
    printf("value is %zd\n", Py_REFCNT(value));  /* Prints 1 */

    printf("Calling PyDict_SetItem\n");
    if (PyDict_SetItem(dict, key, value) < 0) {
        goto error;
    }
    printf("key is %zd\n", Py_REFCNT(key));  /* Prints 2 */
    printf("value is %zd\n", Py_REFCNT(value));  /* Prints 2 */

    printf("Decrefing key and value\n");
    Py_DECREF(key);
    Py_DECREF(value);
    /* Reading the counts here is safe only because dict still holds a reference to each object. */
    printf("key is %zd\n", Py_REFCNT(key));   /* Prints 1 */
    printf("value is %zd\n", Py_REFCNT(value));   /* Prints 1 */

    Py_Finalize();
    return 0; // would return the dict in normal code
error:
    printf("error");
    Py_XDECREF(dict);
    Py_XDECREF(key);
    Py_XDECREF(value);
    Py_Finalize();
    return 1;
}

You can compile like:

gcc -c -I/path/to/python/include/python3.7m dict.c
gcc dict.o -L/path/to/python/lib/python3.7/config-3.7m-i386-linux-gnu -L/path/to/python/lib -Wl,-rpath=/path/to/python/lib -lpython3.7m -lpthread -lm -ldl -lutil -lrt -Xlinker -export-dynamic -m32

Solution

  • The Py_DECREF(key); in insertdict does not run on every success path. It runs only when an equal key was already present, either because the dict had an existing entry for it or because the dict was a split-table dict sharing keys with other dicts that had that key. On that path, the provided key is not stored, so the initial Py_INCREF(key); has to be canceled.

    On the key-not-present path, insertdict hits a different return statement and doesn't decref the key:

    if (ix == DKIX_EMPTY) {
        /* Insert into new slot. */
        assert(old_value == NULL);
        if (mp->ma_keys->dk_usable <= 0) {
            /* Need to resize. */
            if (insertion_resize(mp) < 0)
                goto Fail;
        }
        Py_ssize_t hashpos = find_empty_slot(mp->ma_keys, hash);
        ep = &DK_ENTRIES(mp->ma_keys)[mp->ma_keys->dk_nentries];
        dictkeys_set_index(mp->ma_keys, hashpos, mp->ma_keys->dk_nentries);
        ep->me_key = key;
        ep->me_hash = hash;
        if (mp->ma_values) {
            assert (mp->ma_values[mp->ma_keys->dk_nentries] == NULL);
            mp->ma_values[mp->ma_keys->dk_nentries] = value;
        }
        else {
            ep->me_value = value;
        }
        mp->ma_used++;
        mp->ma_version_tag = DICT_NEXT_VERSION();
        mp->ma_keys->dk_usable--;
        mp->ma_keys->dk_nentries++;
        assert(mp->ma_keys->dk_usable >= 0);
        ASSERT_CONSISTENT(mp);
        return 0;
    }