
Loading infinitely nested tuples from YAML


The following code generates an infinitely nested Python list:

import yaml
i_list = yaml.load('&id1 [3, *id1]', Loader=yaml.Loader)
print(i_list)
# [3, [...]]
print(i_list[1] is i_list)
# True

I can also name the Python list type explicitly:

i_list = yaml.load('&id1 !!python/list [3, *id1]', Loader=yaml.Loader)

I can also create that structure manually, without parsing YAML:

i_list = [3]
i_list.append(i_list)

However, that last trick won't work for tuples, or for any other immutable object. To create an infinitely nested tuple, I have to go through CPython's C API via ctypes:

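from ctypes import pythonapi, c_void_p, c_ssize_t
from _ctypes import PyObj_FromPtr

# Declare pointer-sized types; ctypes defaults to C int, which can
# truncate object addresses on 64-bit builds.
pythonapi.PyTuple_New.restype = c_void_p
pythonapi.PyTuple_New.argtypes = [c_ssize_t]
pythonapi.PyTuple_SetItem.argtypes = [c_void_p, c_ssize_t, c_void_p]

t = pythonapi.PyTuple_New(1)        # address of a new 1-tuple
pythonapi.PyTuple_SetItem(t, 0, t)  # slot 0 now points back at the tuple
i_tup = PyObj_FromPtr(t)
print(repr(i_tup))
# ((...),)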
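
For comparison, here is a minimal sketch of why the append trick has no tuple equivalent in plain Python. It assumes nothing beyond built-in tuple behaviour: concatenation produces a new tuple that contains the old one, and item assignment is rejected.

```python
t = (3,)
t = t + (t,)        # builds a NEW tuple that contains the OLD one
print(t)            # (3, (3,))
print(t[1] is t)    # False: the inner tuple is the old object, not t itself

try:
    t[1] = t        # tuples do not support item assignment
except TypeError as exc:
    print("TypeError:", exc)
```

A tuple's items are fixed at the moment it is created, so it cannot contain a reference to itself unless something mutates it afterwards, which is what the ctypes code above does.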

The expected YAML for such a tuple would look like this:

&id001 !!python/tuple
- *id001

and indeed, this is the output of yaml.dump(i_tup). However, PyYAML can't load the very same document:

yaml.load(yaml.dump(i_tup), Loader=yaml.Loader)

ConstructorError: found unconstructable recursive node
  in "<string>", line 1, column 1:
    &id001 !!python/tuple
    ^

Is there a good reason for this? Can you suggest a workaround?


Solution

  • Tuples are simply not designed to do this. There's no way to build a self-referencing tuple through the ordinary Python API, and even the C API function that lets you get around that has a check (op->ob_refcnt != 1) that is very likely to break things if you try:

    int
    PyTuple_SetItem(register PyObject *op, register Py_ssize_t i, PyObject *newitem)
    {
        register PyObject *olditem;
        register PyObject **p;
        if (!PyTuple_Check(op) || op->ob_refcnt != 1) {
            Py_XDECREF(newitem);
            PyErr_BadInternalCall();
            return -1;
        }
        ...
    }
    

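    Note that the check is on the reference count, not on the self-reference as such. Once the tuple is referenced from more than one place, as a self-referencing tuple that is also visible to Python code will be, any further call to this function on that tuple fails with a bad-internal-call error. Build tuples like this at your own risk, and don't be surprised if your code breaks for weird reasons.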
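
As for why loading fails, the sketch below is a toy model and not PyYAML's actual code. It assumes a loader resolves an alias by looking up an already-created object. A list can be created empty, registered under its anchor, and filled afterwards, so an alias inside it finds the list itself. A tuple needs all of its items before it exists, so the alias has nothing to point at. The node format, `construct`, `resolve`, and `RecursiveNodeError` are all hypothetical names invented for this illustration.

```python
class RecursiveNodeError(Exception):
    """Raised when an alias points at a node that is still being built."""

_IN_PROGRESS = object()  # marker: anchor seen, object not yet available


def resolve(item, anchors):
    """Return a plain value as-is, or look up an ("alias", name) item."""
    if isinstance(item, tuple) and item and item[0] == "alias":
        target = anchors[item[1]]
        if target is _IN_PROGRESS:
            raise RecursiveNodeError("unconstructable recursive node: " + item[1])
        return target
    return item


def construct(node, anchors):
    """Build a node of the form (kind, anchor_name_or_None, items)."""
    kind, anchor, items = node
    if kind == "list":
        result = []
        if anchor:
            anchors[anchor] = result        # register BEFORE filling
        for item in items:
            result.append(resolve(item, anchors))
        return result
    if kind == "tuple":
        if anchor:
            anchors[anchor] = _IN_PROGRESS  # nothing to register yet
        built = tuple(resolve(item, anchors) for item in items)
        if anchor:
            anchors[anchor] = built
        return built
    raise ValueError("unknown node kind: " + kind)


# Models '&id1 [3, *id1]': works, the list is registered before it is filled.
i_list = construct(("list", "id1", [3, ("alias", "id1")]), {})
print(i_list[1] is i_list)  # True

# Models '&id001 !!python/tuple [*id001]': fails, no object exists yet.
try:
    construct(("tuple", "id001", [("alias", "id001")]), {})
except RecursiveNodeError as exc:
    print("tuple failed:", exc)
```

If this model matches what the loader does, one workaround is to serialize the structure as a list or another mutable container, which can be filled after it is created, and to convert or wrap it after loading if tuple-like behaviour is needed.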