The __getattribute__ method and descriptors


According to this guide on Python descriptors: https://docs.python.org/howto/descriptor.html

Method objects in new-style classes are implemented using descriptors in order to avoid special-casing them in attribute lookup.

The way I understand this, there is a method object type that implements __get__. It returns a bound method object when called with an instance, and an unbound method object when called with only a class and no instance. The article also states that this logic is implemented in the object.__getattribute__ method, like so:

def __getattribute__(self, key):
    "Emulate type_getattro() in Objects/typeobject.c"
    v = object.__getattribute__(self, key)
    # If the value found is a descriptor, invoke it with no instance.
    if hasattr(v, '__get__'):
        return v.__get__(None, self)
    return v

However, object.__getattribute__ is itself a method! So how is it bound to an object without infinite recursion? And if it is special-cased in the attribute lookup, does that not defeat the purpose of removing the old-style special casing?


Solution

  • Actually, in CPython the default __getattribute__ implementation is not a Python method, but is instead implemented in C. It can access object slots (entries in the C structure representing Python objects) directly, without bothering to go through the pesky attribute access routine.

    Just because your Python code has to do this, doesn't mean the C code has to. :-)
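
    One way to see this from Python is to inspect the type of object.__getattribute__. The check below is a small sketch that assumes CPython; other implementations may report different types. The names Plain, slot, is_python_function, is_slot_wrapper and inherited are made up for the illustration.

```python
import types


class Plain:
    """A class that does not override __getattribute__."""


# On CPython, the default implementation is a C "slot wrapper",
# not a function object compiled from Python source.
slot = object.__getattribute__
is_python_function = isinstance(slot, types.FunctionType)
is_slot_wrapper = isinstance(slot, types.WrapperDescriptorType)

# A class without its own override simply inherits the same object.
inherited = Plain.__getattribute__ is object.__getattribute__

print(type(slot).__name__, is_python_function, is_slot_wrapper, inherited)
```

    The wrapper still has a __get__ of its own, so it can be bound like any other method when you look it up by name; the interpreter just never needs to do that lookup to perform ordinary attribute access.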

    If you do implement a Python __getattribute__ method, just use object.__getattribute__(self, attrname), or better still, super().__getattribute__(attrname) to access attributes on self. That way you won't hit recursion either.

    In the CPython implementation, the attribute access is actually handled by the tp_getattro slot in the C type object, with a fallback to the tp_getattr slot.

    To be exhaustive and to fully expose what the C code does, when you use attribute access on an instance, here is the full set of functions called:

    • Python translates attribute access to a call to the PyObject_GetAttr() C function. The implementation for that function looks up the tp_getattro or tp_getattr slot for your class.

    • The object type has filled the tp_getattro slot with the PyObject_GenericGetAttr function, which delegates the call to _PyObject_GenericGetAttrWithDict (with the *dict pointer set to NULL and the suppress argument set to 0). This function is your object.__getattribute__ method (a special table maps between the name and the slots).

    • This _PyObject_GenericGetAttrWithDict function can locate the instance __dict__ object directly (classically through the type's tp_dictoffset slot; tp_dict is the class's own namespace), but for descriptors (including methods), the _PyType_Lookup function is used.

    • _PyType_Lookup handles caching and delegates to find_name_in_mro on cache misses; the latter looks up attributes on the class (and superclasses). The code uses direct pointers to the tp_dict slot on each class in the MRO to reference class attributes.

    • If a descriptor is found by _PyType_Lookup, it is returned to _PyObject_GenericGetAttrWithDict, which then calls the tp_descr_get function on that object (the __get__ hook).
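
    The steps above can be approximated in pure Python. This is only a rough sketch of the lookup order, not the C code itself: it has no method cache, ignores __getattr__ fallbacks, and uses ordinary attribute access where C would read struct fields. The names find_in_mro, generic_getattr and Demo are invented for the example.

```python
_MISSING = object()  # sentinel meaning "nothing found"


def find_in_mro(cls, name):
    """Rough stand-in for _PyType_Lookup: scan each class dict in MRO order."""
    for base in cls.__mro__:
        namespace = vars(base)
        if name in namespace:
            return namespace[name]
    return _MISSING


def generic_getattr(obj, name):
    """Approximate the order of checks made for instance attribute access."""
    cls = type(obj)
    found = find_in_mro(cls, name)

    getter = None
    if found is not _MISSING:
        getter = getattr(type(found), "__get__", None)

    is_data_descriptor = getter is not None and (
        hasattr(type(found), "__set__") or hasattr(type(found), "__delete__")
    )
    if is_data_descriptor:
        # Data descriptors (e.g. property) win over the instance dict.
        return getter(found, obj, cls)

    instance_dict = getattr(obj, "__dict__", {})
    if name in instance_dict:
        return instance_dict[name]

    if getter is not None:
        # Non-data descriptor, e.g. a plain function becoming a bound method.
        return getter(found, obj, cls)
    if found is not _MISSING:
        return found  # ordinary class attribute
    raise AttributeError(name)


class Demo:
    kind = "class attribute"

    def __init__(self):
        self.x = 1

    def method(self):
        return self.x + 1

    @property
    def doubled(self):
        return self.x * 2


d = Demo()
d.__dict__["doubled"] = "shadow"  # stored, but the property takes precedence

for attr in ("x", "kind", "doubled"):
    print(attr, generic_getattr(d, attr) == getattr(d, attr))
print("method", generic_getattr(d, "method")())
```

    The shadowed "doubled" entry shows why the data-descriptor check has to come before the instance dictionary is consulted.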

    When you access an attribute on the class itself, the type->tp_getattro slot is serviced not by _PyObject_GenericGetAttrWithDict but by the type_getattro() function, which takes metaclasses into account too. This version calls __get__ as well, but passes no instance (NULL at the C level, which a Python-level __get__ sees as None).
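
    The two ways of calling __get__ can be reproduced by hand. The sketch below uses a hypothetical Greeter class and assumes Python 3, where class-level access to a function returns the function itself (Python 2 returned an unbound method object instead).

```python
class Greeter:
    def hello(self):
        return "hi"


# The plain function object as stored in the class namespace.
raw = vars(Greeter)["hello"]
g = Greeter()

# Instance access: __get__ receives the instance and returns a bound method.
via_instance = raw.__get__(g, Greeter)

# Class access: __get__ receives None for the instance.
via_class = raw.__get__(None, Greeter)

print(via_instance.__self__ is g, via_instance())
print(via_class is raw, Greeter.hello is raw)
```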

    Nowhere does this code have to recursively call __getattribute__ to access the __dict__ attribute, as it can simply reach into the C structures directly.