
Python attribute access with a data descriptor


I've read in several blogs and docs that accessing an instance attribute obj.a works as follows:

  1. try accessing data descriptor (named a) in current class __dict__ and base class __dict__
  2. find a in obj.__dict__
  3. find non-data descriptor (named a) in current class __dict__ and base class __dict__
  4. find attribute (named a) in current class __dict__ and base class __dict__
  5. call __getattr__ if any
  6. raise AttributeError

But I found that this lookup rule does not match the behavior of the code below:

class ADesc(object):
    def __init__(self, name):
        self._name = name

    def __get__(self, obj, objclass):
        print('get.....')
        return self._name + '  ' + str(obj) + '  ' + str(objclass)

    def __set__(self, obj, value):
        print('set.....')
        self._name = value


class A(object):
    dd_1 = ADesc('dd_1 in A')


class B(A):
    dd_1 = 'dd_1 in B'


if __name__ == '__main__':
    print(A.__dict__)
    # {'dd_1': <__main__.ADesc object at 0x10ed0d050>, '__dict__': <attribute '__dict__' of 'A' objects>, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'A' objects>, '__doc__': None}

    print(B.__dict__)
    # {'dd_1': 'dd_1 in B', '__module__': '__main__', '__doc__': None}

    b = B()
    print(b.dd_1)  # dd_1 in B

I expected the last print(b.dd_1) to invoke __get__ in ADesc: according to the 1st rule, the __dict__ of base class A contains a data descriptor named dd_1, so that descriptor should be called. Is the access rule above wrong, or is some other magic involved here?


Solution

  • You misunderstood how descriptors are found in classes. Python will use the first such name in the class hierarchy. Once found, the search stops. B.dd_1 exists, so A.dd_1 is not considered.

    The documentation tells you about base classes for the case where B does not define dd_1; in that case B is searched, then A. But when B has an attribute dd_1, any further search is stopped.
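A small sketch can make the two cases visible side by side. The names Desc, Base, Shadowing, and Inheriting are made up for this illustration; Desc is a simplified stand-in for the question's ADesc.

```python
class Desc:
    """Data descriptor: defines both __get__ and __set__."""

    def __get__(self, obj, objtype=None):
        return 'from descriptor'

    def __set__(self, obj, value):
        raise AttributeError('read-only')


class Base:
    x = Desc()


class Shadowing(Base):
    x = 'plain string'   # found first in the MRO, so Base.x is never reached


class Inheriting(Base):
    pass                 # no x here, so the search continues to Base


print(Shadowing().x)     # plain string
print(Inheriting().x)    # from descriptor
```

Only Inheriting reaches the descriptor, because nothing earlier in its MRO defines x.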

    Note that the search order is set by the class MRO (method resolution order). Rather than distinguish between a search in the class __dict__, and separately, the __dict__ attributes of the base classes, you should see the search as:

    def find_class_attribute(cls, name):
        for c in cls.__mro__:
            if name in c.__dict__:
                return c.__dict__[name]
    

    The MRO (as embodied by the cls.__mro__ attribute) includes the current class object:

    >>> B.__mro__
    (<class '__main__.B'>, <class '__main__.A'>, <class 'object'>)
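
The MRO matters most with multiple inheritance, where "base classes" alone does not say which class is searched first. The following is a sketch with hypothetical classes (Base, Left, Right, Child), reusing the find_class_attribute helper from above with an explicit None for the not-found case.

```python
def find_class_attribute(cls, name):
    """Return the first raw class-dict entry for name along the MRO."""
    for c in cls.__mro__:
        if name in c.__dict__:
            return c.__dict__[name]
    return None  # explicit "not found" (an addition for this sketch)


class Base:
    x = 'Base'


class Left(Base):
    pass


class Right(Base):
    x = 'Right'


class Child(Left, Right):
    pass


print([c.__name__ for c in Child.__mro__])
# ['Child', 'Left', 'Right', 'Base', 'object']

print(find_class_attribute(Child, 'x'))   # Right
```

Right comes before Base in the MRO of Child, so Right.x is found first even though Left inherits directly from Base.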
    

    The relevant documentation is found in the datamodel reference, where the Custom Classes section states:

    Class attribute references are translated to lookups in this dictionary, e.g., C.x is translated to C.__dict__["x"] (although there are a number of hooks which allow for other means of locating attributes). When the attribute name is not found there, the attribute search continues in the base classes.

    The actual implementation for instance attributes works like this:

    • The class is located (type(instance))
    • The class __getattribute__ method is called (type(instance).__getattribute__(instance, name))
    • __getattribute__ scans the MRO to see if the name exists on the class or its base classes (find_class_attribute(type(instance), name))
      • if there is such an object, and it is a data descriptor (it has a __set__ or __delete__ method), its __get__ method is called, the result is used, and the search stops.
      • if there is such an object but it is not a data descriptor, a reference is kept for later.
    • __getattribute__ looks for the name in instance.__dict__
      • if there is such an object, the search stops. The instance attribute is used.
    • At this point no data descriptor was found, and there is no attribute in the instance dict. But the search through the MRO may have located an object that is not a data descriptor
      • if there is a reference to an object found in the MRO, it is used (through its __get__ method if it is a non-data descriptor, as-is otherwise) and the search stops.
    • If there is a __getattr__ method defined on the class (or a base class), it is called, and the result is used. Search stops.
    • an AttributeError is raised.
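
The steps above can be approximated in pure Python. This is only a sketch, not CPython's actual implementation: the lookup function, the _MISSING sentinel, and the Data, NonData, and Demo classes are names invented for this example, and the __getattr__ fallback is folded into the same function for brevity.

```python
_MISSING = object()  # sentinel, so that None can be a legitimate attribute value


def find_class_attribute(cls, name):
    """Return the first raw class-dict entry for name along the MRO."""
    for c in cls.__mro__:
        if name in c.__dict__:
            return c.__dict__[name]
    return _MISSING


def lookup(instance, name):
    """Approximate instance attribute lookup, following the steps above."""
    cls = type(instance)
    class_attr = find_class_attribute(cls, name)
    attr_type = type(class_attr)
    has_get = class_attr is not _MISSING and hasattr(attr_type, '__get__')

    # 1. A data descriptor found on the class wins over the instance dict.
    if has_get and (hasattr(attr_type, '__set__')
                    or hasattr(attr_type, '__delete__')):
        return attr_type.__get__(class_attr, instance, cls)

    # 2. Otherwise the instance dict is consulted.
    instance_dict = getattr(instance, '__dict__', {})
    if name in instance_dict:
        return instance_dict[name]

    # 3. Fall back to what the MRO scan found: a non-data descriptor
    #    is bound through __get__, anything else is returned as-is.
    if class_attr is not _MISSING:
        if has_get:
            return attr_type.__get__(class_attr, instance, cls)
        return class_attr

    # 4. Last resort: a __getattr__ hook defined on the class or a base class.
    hook = find_class_attribute(cls, '__getattr__')
    if hook is not _MISSING:
        return hook(instance, name)

    raise AttributeError(name)


class Data:
    def __get__(self, obj, objtype=None):
        return 'data descriptor'

    def __set__(self, obj, value):
        pass


class NonData:
    def __get__(self, obj, objtype=None):
        return 'non-data descriptor'


class Demo:
    d = Data()
    n = NonData()

    def __getattr__(self, name):
        return 'fallback for ' + name


demo = Demo()
demo.__dict__['d'] = 'instance d'   # shadowed by the data descriptor
demo.__dict__['n'] = 'instance n'   # shadows the non-data descriptor

print(lookup(demo, 'd'))     # data descriptor
print(lookup(demo, 'n'))     # instance n
print(lookup(demo, 'zzz'))   # fallback for zzz
```

For these three names, lookup(demo, name) returns the same value as getattr(demo, name).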