Simple repro:
class VocalDescriptor(object):
def __get__(self, obj, objtype):
print('__get__, obj={}, objtype={}'.format(obj, objtype))
def __set__(self, obj, val):
print('__set__')
class B(object):
v = VocalDescriptor()
B.v # prints "__get__, obj=None, objtype=<class '__main__.B'>"
B.v = 3 # does not print "__set__", evidently does not trigger descriptor
B.v # does not print anything, we overwrote the descriptor
This question has an effective duplicate, but the duplicate was not answered, and I dug a bit more into the CPython source as a learning exercise. Warning: i went into the weeds. I'm really hoping I can get help from a captain who knows those waters. I tried to be as explicit as possible in tracing the calls I was looking at, for my own future benefit and the benefit of future readers.
I've seen a lot of ink spilled over the behavior of __getattribute__
applied to descriptors, e.g. lookup precedence. The Python snippet in "Invoking Descriptors" just below For classes, the machinery is in type.__getattribute__()...
roughly agrees in my mind with what I believe is the corresponding CPython source in type_getattro
, which I tracked down by looking at "tp_slots" then where tp_getattro is populated. And the fact that B.v
initially prints __get__, obj=None, objtype=<class '__main__.B'>
makes sense to me.
What I don't understand is, why does the assignment B.v = 3
blindly overwrite the descriptor, rather than triggering v.__set__
? I tried to trace the CPython call, starting once more from "tp_slots", then looking at where tp_setattro is populated, then looking at type_setattro. type_setattro
appears to be a thin wrapper around _PyObject_GenericSetAttrWithDict. And there's the crux of my confusion: _PyObject_GenericSetAttrWithDict
appears to have logic that gives precedence to a descriptor's __set__
method!! With this in mind, I can't figure out why B.v = 3
blindly overwrites v
rather than triggering v.__set__
.
Disclaimer 1: I did not rebuild Python from source with printfs, so I'm not
completely sure type_setattro
is what's being called during B.v = 3
.
Disclaimer 2: VocalDescriptor
is not intended to exemplify "typical" or "recommended" descriptor definition. It's a verbose no-op to tell me when the methods are being called.
You are correct that B.v = 3
simply overwrites the descriptor with an integer (as it should). In the descriptor protocol, __get__
is designed to be called as instance attribute or class attribute, but __set__
is designed to be called only as instance attribute.
For B.v = 3
to invoke a descriptor, the descriptor should have been defined on the metaclass, i.e. on type(B)
.
>>> class BMeta(type):
... v = VocalDescriptor()
...
>>> class B(metaclass=BMeta):
... pass
...
>>> B.v = 3
__set__
To invoke the descriptor on B
, you would use an instance: B().v = 3
will do it.
The reason for B.v
also invoking the getter is to allow user's customization of what B.v
does, independently of whatever B().v
does. A common pattern is to allow direct access on the descriptor instance, by returning the descriptor itself when a class attribute access was used:
class VocalDescriptor(object):
def __get__(self, obj, objtype):
if obj is None:
return self
print('__get__, obj={}, objtype={}'.format(obj, objtype))
def __set__(self, obj, val):
print('__set__')
Now B.v
would return some instance like <mymodule.VocalDescriptor object at 0xdeadbeef>
which you can interact with. It is literally the descriptor object, defined as a class attribute, and its state B.v.__dict__
is shared between all instances of B
.
Of course it is up to user's code to define exactly what they want B.v
to do, returning self
is just the common pattern. A classmethod
is an example of a descriptor which does something different here, see the Descriptor HowTo Guide for a pure-python implementation of classmethod
.
Unlike __get__
, which can be used to customize B().v
and B.v
independently, __set__
is not invoked unless the attribute access is on an instance. I would suppose that the goal of customizing B().v = other
and B.v = other
using the same descriptor v
is not common or useful enough to complicate the descriptor protocol further, especially since the latter is still possible with a metaclass descriptor anyway, as shown in BMeta.v
above.