The CPython Tutorial defines a custom initializer for a custom type which has the following lines :
if (first) {
tmp = self->first;
Py_INCREF(first);
self->first = first;
Py_XDECREF(tmp);
}
but the tutorial advises against this simpler but more nefarious version :
if (first) {
Py_XDECREF(self->first);
Py_INCREF(first);
self->first = first;
}
For the following reason :
But this would be risky. Our type doesn’t restrict the type of the first member, so it could be any kind of object. It could have a destructor that causes code to be executed that tries to access the first member; or that destructor could release the Global interpreter Lock and let arbitrary code run in other threads that accesses and modifies our object.
I understand the reason why a multithreaded programming can break the simple version. There can be a race condition where the new self->first
is set by one thread and freed by another thread (by way of Py_XDECREF
). This is avoided in the correct version because both threads will have had to set self->first
before freeing it.
The part I don't understand is this section :
It could have a destructor that causes code to be executed that tries to access the first member
If first
is an object that is getting garbage collected, its destructor would have to be called and it should have access to itself through the duration of the destructor. Why can this be dangerous?
Thank you!
Let's suppose we have:
custom = Custom(...) # global variable
class SomePyClass:
def __del__(self):
# access global variable
custom.__init__(1, 2, 3)
Let's suppose that custom.first
is an instance of SomePyClass
and that this instance gets destructed at the line Py_XDECREF(self->first);
.
The arbitrary code in the destructor is going to cause custom.__init__
to be called again. This will cause Py_XDECREF
to be called again on an object that really should have a refcount of 0 and which is already in the process of getting destroyed. That alone represents a reference counting error since it's ref-count will end up below 0 (note this may not be exactly what happens since it temporarily gets given a ref-count of 1 for the duration of the destructor, but it absolutely is a ref-counting error).
Note also the new value of first
that is being assigned to in __init__
is also mis-reference-counted: while our instance is decrefed twice, the value that is about to be assigned is replaced without decrefing it.