
How to access instance dictionary after overriding __dict__ attribute on its class?


Consider this example, in which the __dict__ of every instance of class A points to a global dict named shared.

shared = {'a': 1, 'b': 2}

class A(object):
    def __init__(self):
        self.__dict__ = shared

Now let's test a few things:

>>> a = A()
>>> b = A()
>>> a.a, a.b, b.a, b.b
(1, 2, 1, 2)
>>> b.x = 100
>>> shared
{'a': 1, 'x': 100, 'b': 2}
>>> a.x
100
>>> c = A()
>>> c.a, c.b, c.x
(1, 2, 100)
>>> shared['foo'] = 'bar'
>>> a.foo, b.foo, c.foo
('bar', 'bar', 'bar')
>>> a.__dict__, b.__dict__, c.__dict__
({'a': 1, 'x': 100, 'b': 2, 'foo': 'bar'},
 {'a': 1, 'x': 100, 'b': 2, 'foo': 'bar'},
 {'a': 1, 'x': 100, 'b': 2, 'foo': 'bar'}
)

All works as expected.


Now let's tweak class A a little by adding an attribute named __dict__.

shared = {'a': 1, 'b': 2}

class A(object):
    __dict__ = None
    def __init__(self):
        self.__dict__ = shared

Let's run the same set of steps again:

>>> a = A()
>>> b = A()
>>> a.a, a.b, b.a, b.b
AttributeError: 'A' object has no attribute 'a'
>>> b.x = 100
>>> shared
{'a': 1, 'b': 2}
>>> b.__dict__  # What happened to x?
{'a': 1, 'b': 2}
>>> a.x
AttributeError: 'A' object has no attribute 'x'
>>> c = A()
>>> c.a, c.b, c.x
AttributeError: 'A' object has no attribute 'a'
>>> shared['foo'] = 'bar'
>>> a.foo, b.foo, c.foo
AttributeError: 'A' object has no attribute 'foo'
>>> a.__dict__, b.__dict__, c.__dict__
({'a': 1, 'b': 2, 'foo': 'bar'},
 {'a': 1, 'b': 2, 'foo': 'bar'},
 {'a': 1, 'b': 2, 'foo': 'bar'}
)
>>> b.x  # Where did this come from?
100

The first case worked as expected but the second did not, so I would like to know what changed after adding the class-level __dict__ attribute. And is there any way to access the instance dictionary that is actually being used now?


Solution

  • In the first case, self.__dict__ resolves to the __dict__ descriptor provided by its type. This descriptor returns the underlying instance dictionary and can also replace it with a new one, using PyObject_GenericGetDict and PyObject_GenericSetDict respectively.

    >>> A.__dict__
    mappingproxy(
    {'__module__': '__main__',
     '__init__': <function A.__init__ at 0x1041fb598>,
     '__dict__': <attribute '__dict__' of 'A' objects>,
     '__weakref__': <attribute '__weakref__' of 'A' objects>, '__doc__': None
    })
    >>> A.__dict__['__dict__'].__get__(a)
    {'a': 1, 'b': 2}
    

    And of course we can set a new dictionary from here as well:

    >>> new_dict = {}
    >>> A.__dict__['__dict__'].__set__(a, new_dict)  # a.__dict__ = new_dict
    >>> a.spam = 'eggs'
    >>> a.__dict__
    {'spam': 'eggs'}
    >>> new_dict
    {'spam': 'eggs'}
    >>> b = A()  # Points to `shared`
    >>> b.__dict__
    {'a': 1, 'b': 2}
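
    The two transcripts above can be condensed into one self-contained script. This is only a sketch of the first case; the name dict_descr is introduced here for illustration and is not part of the original session.

```python
shared = {'a': 1, 'b': 2}

class A:
    def __init__(self):
        self.__dict__ = shared

a = A()

# The getset descriptor that class creation placed in A's namespace.
dict_descr = vars(A)['__dict__']

# Reading through the descriptor is what `a.__dict__` does.
assert dict_descr.__get__(a) is shared

# Writing through it is what `a.__dict__ = new_dict` does.
new_dict = {}
dict_descr.__set__(a, new_dict)
a.spam = 'eggs'
assert a.__dict__ is new_dict
assert new_dict == {'spam': 'eggs'}

# Instances created afterwards still start out pointing at `shared`.
b = A()
assert b.__dict__ is shared
```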
    

    In the second case the class namespace itself contains an entry named __dict__ (our None), yet A.__dict__ still evaluates to the mappingproxy:

    >>> A.__dict__
    mappingproxy(
    {'__module__': '__main__',
     '__dict__': None,
     '__init__': <function A.__init__ at 0x1041cfae8>,
     '__weakref__': <attribute '__weakref__' of 'A' objects>,
     '__doc__': None}
    )
    

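    For classes, __dict__ is special in this respect: A.__dict__ is served by a read-only descriptor defined on type, so it never returns the '__dict__' entry stored in the class namespace. Compare it with an ordinary class attribute such as __weakref__, which is read straight from the namespace and can be reassigned: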

    >>> A.__weakref__ is A.__dict__['__weakref__']
    True    
    >>> A.__weakref__ = 1    
    >>> A.__weakref__, A.__dict__['__weakref__']
    (1, 1)
    
    >>> A.__dict__ = {}    
    AttributeError: attribute '__dict__' of 'type' objects is not writable
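
    A small script can make that precedence explicit. It is a sketch rather than a description of CPython's internals; the names type_dict_descr, proxy and error are chosen here for illustration.

```python
class A:
    __dict__ = None

# The class namespace really does hold our value under the key '__dict__' ...
assert vars(A)['__dict__'] is None

# ... but `A.__dict__` is resolved through the descriptor that lives on the
# metaclass `type`, which takes precedence over the namespace entry.
type_dict_descr = vars(type)['__dict__']
proxy = type_dict_descr.__get__(A)
assert type(proxy).__name__ == 'mappingproxy'
assert proxy['__dict__'] is None

# That descriptor has no setter, so the class-level __dict__ cannot be rebound.
error = None
try:
    A.__dict__ = {}
except AttributeError as exc:
    error = exc
assert isinstance(error, AttributeError)
```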
    

    The value we assigned can still be reached through that mapping:

    >>> repr(A.__dict__['__dict__'])
    'None'
    

    At the Python level we have now lost access to the instance dictionary, but internally CPython can still find it through the type's tp_dictoffset, as done in _PyObject_GetDictPtr.

    Both __getattribute__ and __setattr__ also access the underlying instance dictionary using _PyObject_GetDictPtr.
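
    The observable effect is that attribute reads and writes ignore whatever the name __dict__ happens to return. A minimal sketch of the second case, using only plain attribute access:

```python
shared = {'a': 1, 'b': 2}

class A:
    __dict__ = None  # plain class attribute, not a descriptor

    def __init__(self):
        # With no data descriptor named __dict__ on the class, this simply
        # stores an ordinary instance attribute called '__dict__'.
        self.__dict__ = shared

b = A()
b.x = 100

assert b.__dict__ is shared    # found as a normal instance attribute
assert 'x' not in shared       # the write went to the real instance storage
assert b.x == 100              # and reads come from that storage too
assert not hasattr(b, 'a')     # keys of `shared` are not attributes of b
```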

    To get at the instance dictionary that is actually in use, we can reimplement _PyObject_GetDictPtr in Python using ctypes. The version below is by @user4815162342:

    import ctypes
    
    def magic_get_dict(o):
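        # find address of dict whose offset is stored in the type
        # (CPython only: id(o) is the object's address; this assumes a
        # positive __dictoffset__, other values are not handled)
        dict_addr = id(o) + type(o).__dictoffset__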
    
        # retrieve the dict object itself
        dict_ptr = ctypes.cast(dict_addr, ctypes.POINTER(ctypes.py_object))
        return dict_ptr.contents.value
    

    Continuing the second case:

    >>> magic_get_dict(a)
    {'__dict__': {'a': 1, 'b': 2, 'foo': 'bar'}}  # `a` has only one attribute i.e. __dict__
    >>> magic_get_dict(b)
    {'__dict__': {'a': 1, 'b': 2, 'foo': 'bar'}, 'x': 100}  # `x` found
    >>> magic_get_dict(b).update(shared)
    >>> b.a, b.b, b.foo, b.x
    (1, 2, 'bar', 100)
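
    If the class hierarchy can be changed, a ctypes-free variation is possible. The sketch below assumes a hypothetical helper base class, Base, that is not in the question; because Base is created first, its namespace keeps the original __dict__ descriptor even though A shadows the name. The helper real_instance_dict is likewise illustrative.

```python
shared = {'a': 1, 'b': 2}

class Base:
    """Hypothetical helper base; it exists only to own the __dict__ descriptor."""

class A(Base):
    __dict__ = None  # shadows the name for lookups that start at A

    def __init__(self):
        self.__dict__ = shared

def real_instance_dict(obj):
    # Base's namespace still holds the original getset descriptor, and it
    # accepts any instance of Base, including instances of A.
    return vars(Base)['__dict__'].__get__(obj)

b = A()
b.x = 100

real = real_instance_dict(b)
assert real['__dict__'] is shared
assert real['x'] == 100

# Same final step as the ctypes version: copy the shared keys in.
real.update(shared)
assert (b.a, b.b, b.x) == (1, 2, 100)
```

    This does not help with the class exactly as written in the question, since there the only owner of the descriptor is A itself and the class body has replaced it.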