Just for testing, I tried to resize a tuple using ctypes, with terrible results:
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from ctypes import py_object, c_long, pythonapi
>>> _PyTuple_Resize = pythonapi._PyTuple_Resize
>>> _PyTuple_Resize.argtypes = (py_object, c_long)
>>> a = ()
>>> b = c_long(1)
>>> _PyTuple_Resize(a, b)
Segmentation fault (core dumped)
What was going wrong?
There are some issues with your code.
Let's start with the signature of _PyTuple_Resize, which is

int _PyTuple_Resize(PyObject **p, Py_ssize_t newsize)

i.e. the first argument isn't a py_object (which would correspond to PyObject *p), but a py_object passed by reference (PyObject **p). That means:
from ctypes import POINTER, py_object, c_ssize_t, byref, pythonapi
_PyTuple_Resize = pythonapi._PyTuple_Resize
_PyTuple_Resize.argtypes = (POINTER(py_object), c_ssize_t)
However, there is actually no need to define argtypes for _PyTuple_Resize (or for any other pythonapi function); one only has to set restype if the C return type is not int (which it is in the case of _PyTuple_Resize).
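For illustration, a small sketch using the public PyTuple_Size, whose C return type is Py_ssize_t rather than int, so here restype should be declared (on 64-bit builds the default int restype could truncate large results):

```python
from ctypes import c_ssize_t, py_object, pythonapi

# restype defaults to c_int; declare it only when the C return type differs.
# PyTuple_Size returns Py_ssize_t, so we set restype accordingly:
pythonapi.PyTuple_Size.argtypes = (py_object,)
pythonapi.PyTuple_Size.restype = c_ssize_t

print(pythonapi.PyTuple_Size((1, 2, 3)))  # -> 3
```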
Then, the above linked documentation states:
Because tuples are supposed to be immutable, this should only be used if there is only one reference to the object. Do not use this if the tuple may already be known to some other part of the code.
Well, the empty tuple is quite well known to other parts of the code:
import sys
a=()
sys.getrefcount(a)
# 28236
As @CristiFati has pointed out in the comments, this is a small optimization, possible because tuples are immutable: all empty tuples share the same singleton. So using _PyTuple_Resize on an empty tuple is quite problematic, even though this corner case is caught in the code of _PyTuple_Resize:
if (oldsize == 0) {
    /* Empty tuples are often shared, so we should never
       resize them in-place even if we do own the only
       (current) reference */
    Py_DECREF(v);
    *pv = PyTuple_New(newsize);
    return *pv == NULL ? -1 : 0;
}
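Indeed, with the pass-by-reference call from above, resizing the empty tuple no longer segfaults: the quoted branch leaves the singleton alone and simply hands back a brand-new tuple (whose slots, however, are NULL pointers). A minimal sketch:

```python
from ctypes import POINTER, byref, c_ssize_t, py_object, pythonapi

pythonapi._PyTuple_Resize.argtypes = (POINTER(py_object), c_ssize_t)

A = py_object(())
# Passed by reference, the empty-tuple corner case is handled gracefully:
res = pythonapi._PyTuple_Resize(byref(A), 2)
print(res)           # -> 0, i.e. success
print(len(A.value))  # -> 2 (but both slots are still NULL pointers!)
```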
However, my point stands: one has to make sure that there are no other references before calling _PyTuple_Resize.
Now, even when _PyTuple_Resize is used on a tuple which is unknown to other parts of the program:
b = c_ssize_t(2)
A=py_object(("no one knows me",))
pythonapi._PyTuple_Resize(byref(A), b) # returns 0 - means everything ok
We get an object which is in an inconsistent state:
print(A)
# py_object(('no one knows me', <NULL>))
The problem is the NULL pointer as second element: many operations on A.value (like print(A.value)) will now segfault or lead to other problems.
So now, one needs to use PyTuple_SetItem (it handles NULL elements correctly and doesn't try to decrease the reference count of a NULL pointer) to fill the NULL elements in the tuple before anything can be done with A.value. Btw., usually one would use PyTuple_SET_ITEM for a newly created tuple and its elements, but it is a macro and thus not part of pythonapi.
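One can check this directly: because PyTuple_SET_ITEM is a macro (or, in recent CPython versions, a static inline function), it is not in the symbol table of the shared library that pythonapi looks into:

```python
import ctypes

# PyTuple_SetItem is a real exported C function:
assert hasattr(ctypes.pythonapi, "PyTuple_SetItem")

# PyTuple_SET_ITEM is a macro/static inline function, so ctypes
# cannot find it among the exported symbols:
assert not hasattr(ctypes.pythonapi, "PyTuple_SET_ITEM")
```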
Because PyTuple_SetItem steals a reference, we need to take care of that as well:
B=py_object(666)
pythonapi.Py_IncRef(B)
pythonapi.PyTuple_SetItem(A,1,B)
print(A.value)
# ('no one knows me', 666)
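Putting the pieces together, here is a self-contained sketch (the filler values "hello" and 666 are arbitrary): resize the empty tuple, fill the NULL slots via PyTuple_SetItem, and only then touch A.value:

```python
from ctypes import POINTER, byref, c_ssize_t, py_object, pythonapi

pythonapi._PyTuple_Resize.argtypes = (POINTER(py_object), c_ssize_t)
pythonapi.Py_IncRef.argtypes = (py_object,)
pythonapi.PyTuple_SetItem.argtypes = (py_object, c_ssize_t, py_object)

A = py_object(())
# The empty-tuple branch gives us a fresh 2-tuple with NULL slots:
assert pythonapi._PyTuple_Resize(byref(A), 2) == 0

# Fill both NULL slots. PyTuple_SetItem steals a reference,
# so take an extra one before handing each item over:
for i, obj in enumerate(("hello", 666)):
    item = py_object(obj)
    pythonapi.Py_IncRef(item)
    pythonapi.PyTuple_SetItem(A, i, item)

# Only now is it safe to work with A.value:
print(A.value)  # -> ('hello', 666)
```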
For small tuples, _PyTuple_Resize will always (at least on 64-bit builds) create a new tuple object rather than reuse the old one: adding an element adds 8 bytes to the memory footprint, and pymalloc hands out memory in 8-byte-aligned steps, so, unlike when single chars are added to a string (where several chars can fit into the same 8-byte step), there is no spare room to grow into and a new object is needed:
b = c_ssize_t(2)
A=py_object(("no one knows me",))
print(id(A.value))
# 2311126190344
pythonapi._PyTuple_Resize(byref(A), b)
print(id(A.value))
# 2311143455304
We see the different ids!
However, for tuple objects with a memory footprint larger than 512 bytes, the memory is managed by the underlying C runtime's memory allocator, and thus resizing in place is possible:
b = c_ssize_t(1002)
A=py_object(("no one knows me",)*1000)
print(id(A.value))
# 2350988176984
pythonapi._PyTuple_Resize(byref(A), b)
print(id(A.value))
# 2350988176984
Now the old object is extended in place - and the id is kept!