
Sequence of constructor calls in complex multiple inheritance in python


I'm having trouble understanding the statement

name = SizedRegexString(maxlen=8, pat='[A-Z]+$')

in the code below. I can't work out how the __init__ calls propagate up the hierarchy.

# Example of defining descriptors to customize attribute access.

from inspect import Parameter, Signature
import re
from collections import OrderedDict


class Descriptor:
    def __init__(self, name=None):
        print("inside desc")
        self.name = name

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        raise AttributeError("Can't delete")


class Typed(Descriptor):
    ty = object

    def __set__(self, instance, value):
        if not isinstance(value, self.ty):
            raise TypeError('Expected %s' % self.ty)
        super().__set__(instance, value)


class String(Typed):
    ty = str


# Length checking
class Sized(Descriptor):
    def __init__(self, *args, maxlen, **kwargs):
        print("inside sized")
        self.maxlen = maxlen
        super().__init__(*args, **kwargs)

    def __set__(self, instance, value):
        if len(value) > self.maxlen:
            raise ValueError('Too big')
        super().__set__(instance, value)


class SizedString(String, Sized):
    pass


# Pattern matching
class Regex(Descriptor):
    def __init__(self, *args, pat, **kwargs):
        print("inside regex")
        self.pat = re.compile(pat)
        super().__init__(*args, **kwargs)

    def __set__(self, instance, value):
        if not self.pat.match(value):
            raise ValueError('Invalid string')
        super().__set__(instance, value)


class SizedRegexString(SizedString, Regex):
    pass


# Structure definition code
def make_signature(names):
    return Signature(
        Parameter(name, Parameter.POSITIONAL_OR_KEYWORD)
        for name in names)


class StructMeta(type):
    @classmethod
    def __prepare__(cls, name, bases):
        return OrderedDict()

    def __new__(cls, clsname, bases, clsdict):
        fields = [key for key, val in clsdict.items()
                  if isinstance(val, Descriptor) ]
        for name in fields:
            clsdict[name].name = name

        clsobj = super().__new__(cls, clsname, bases, dict(clsdict))
        sig = make_signature(fields)
        setattr(clsobj, '__signature__', sig)
        return clsobj


class Structure(metaclass=StructMeta):
    def __init__(self, *args, **kwargs):
        bound = self.__signature__.bind(*args, **kwargs)
        for name, val in bound.arguments.items():
            setattr(self, name, val)


if __name__ == '__main__':
    class Stock(Structure):
        name = SizedRegexString(maxlen=8, pat='[A-Z]+$')


    for item in SizedRegexString.__mro__:
        print(item)

Output from the print statements inside __init__:

inside sized
inside regex
inside desc
inside desc
inside desc

Output from the MRO of the SizedRegexString class:

<class '__main__.SizedRegexString'>
<class '__main__.SizedString'>
<class '__main__.String'>
<class '__main__.Typed'>
<class '__main__.Sized'>
<class '__main__.Regex'>
<class '__main__.Descriptor'>
<class 'object'>

Do the __init__ and __set__ call chains both follow the MRO, or is something else happening here?


Solution

  • I’m not sure exactly what you’re asking, so it would help if you could explain what you expected to happen and how that differed from what actually happened. In the meantime, I’ll try to explain how the MRO applies here.

    First, since the class hierarchy in the example code is rather convoluted, it may help to visualize the inheritance structure:

    [Image: diagram of the inheritance structure]
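
    As a text stand-in for the diagram, here is a minimal sketch that rebuilds only the inheritance relationships from the question (all method bodies are omitted) and lists each class in the MRO with its direct bases. The describe helper is a hypothetical name introduced for illustration.

```python
# Stripped-down copies of the question's classes: only the inheritance
# relationships are kept, all method bodies are omitted.
class Descriptor: pass
class Typed(Descriptor): pass
class String(Typed): pass
class Sized(Descriptor): pass
class SizedString(String, Sized): pass
class Regex(Descriptor): pass
class SizedRegexString(SizedString, Regex): pass


def describe(cls):
    """Return one line per class in cls's MRO, listing its direct bases."""
    lines = []
    for klass in cls.__mro__:
        bases = ", ".join(base.__name__ for base in klass.__bases__)
        lines.append(f"{klass.__name__}({bases})")
    return lines


hierarchy = describe(SizedRegexString)
print("\n".join(hierarchy))
# First line printed: SizedRegexString(SizedString, Regex)
```

    Reading the lines top to bottom gives the same order as SizedRegexString.__mro__ in the question, with Descriptor appearing only once, after every class that inherits from it.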

    Turning to your question,

    Do the __init__ and __set__ call chains both follow the MRO?

    If I’m understanding correctly, the short answer is yes. The MRO is computed from the class’s inheritance graph and is an attribute of the class, not of individual methods, so every method lookup on a SizedRegexString instance walks the same order. Your loop through SizedRegexString.__mro__ shows that order, so I’m guessing your question arose from a perceived disparity between the call chains of __init__ and __set__.

    __init__ Call Chain

    The call chain for SizedRegexString.__init__ is as follows:

    • SizedRegexString.__init__, which is not explicitly defined, so the lookup moves on to the next class in the MRO
    • SizedString.__init__, which is not explicitly defined
    • String.__init__, which is not explicitly defined
    • Typed.__init__, which is not explicitly defined
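    • Sized.__init__, which sets maxlen, then calls super().__init__(), which resolves to the next class in the MRO (Regex), not to Sized’s own base class (Descriptor)
    • Regex.__init__, which sets pat, then calls super().__init__(), which resolves to Descriptor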
    • Descriptor.__init__, which sets name

    So upon calling SizedRegexString.__init__, according to the MRO, there are seven classes that need to be checked for an __init__ method (assuming each one that defines it also calls super().__init__()). However, as you noted, the output from the print statements inside the __init__ methods shows that only the following classes are visited: Sized, Regex, and Descriptor. Note that these are the same classes, in the same order, as the ones in the bullets above that explicitly define __init__.

    So, to us, it might seem like the MRO for SizedRegexString is [Sized, Regex, Descriptor] because those are the only three classes we see actually doing things. However, this is not the case. The MRO bulleted above is still followed, but none of the classes before Sized explicitly defines an __init__ method, so the lookup silently passes over each of them to the next class in the MRO.
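
    The walk described above can be sketched as follows. This is a reduced copy of the question's classes in which the print calls are replaced by appends to a list named calls (an assumption made for illustration), and pat is stored without compiling it.

```python
calls = []  # records which __init__ bodies actually run, in order


class Descriptor:
    def __init__(self, name=None):
        calls.append("Descriptor")
        self.name = name


class Typed(Descriptor): pass
class String(Typed): pass


class Sized(Descriptor):
    def __init__(self, *args, maxlen, **kwargs):
        calls.append("Sized")
        self.maxlen = maxlen
        # super() here means "next class after Sized in the MRO of
        # type(self)", which is Regex for a SizedRegexString instance.
        super().__init__(*args, **kwargs)


class SizedString(String, Sized): pass


class Regex(Descriptor):
    def __init__(self, *args, pat, **kwargs):
        calls.append("Regex")
        self.pat = pat  # the question's code compiles this with re.compile
        super().__init__(*args, **kwargs)


class SizedRegexString(SizedString, Regex): pass


# Classes in the MRO whose own namespace contains an __init__.
definers = [c.__name__ for c in SizedRegexString.__mro__
            if "__init__" in vars(c)]

SizedRegexString(maxlen=8, pat="[A-Z]+$")
print(definers)  # ['Sized', 'Regex', 'Descriptor', 'object']
print(calls)     # ['Sized', 'Regex', 'Descriptor']
```

    The classes that run are exactly the ones that define __init__ themselves, in MRO order. object also defines __init__, but Descriptor.__init__ does not call super(), so the chain stops there.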

    __set__ Call Chain

    That explains how __init__ follows the MRO, but why does __set__ seem to behave differently? To answer this, we can follow the same bulleted MRO used above:

    • SizedRegexString.__set__, which is not explicitly defined, so the lookup moves on to the next class in the MRO
    • SizedString.__set__, which is not explicitly defined
    • String.__set__, which is not explicitly defined
    • Typed.__set__, which checks that value is an instance of self.ty, then calls super().__set__()
    • Sized.__set__, which checks the length of value, then calls super().__set__()
    • Regex.__set__, which ensures a match between self.pat and value, then calls super().__set__()
    • Descriptor.__set__, which adds the key/value pair of self.name and value to instance.__dict__

    The takeaway here is that __set__ follows the same MRO as __init__, because both are looked up on the same class, even though we see activity from four different classes this time, whereas we only saw three with __init__. So, once again, it may seem as if the MRO of SizedRegexString is now [Typed, Sized, Regex, Descriptor]. This can be confusing because this apparent call chain differs from both SizedRegexString.__mro__ and from the apparent call chain we saw for SizedRegexString.__init__.
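
    Here is a minimal sketch of the __set__ chain, again recording calls in a list instead of printing. The Holder class is hypothetical: it stands in for Stock so that no metaclass is needed, which is why the descriptor's name is passed explicitly.

```python
import re

calls = []  # records which __set__ bodies actually run, in order


class Descriptor:
    def __init__(self, name=None):
        self.name = name

    def __set__(self, instance, value):
        calls.append("Descriptor")
        instance.__dict__[self.name] = value


class Typed(Descriptor):
    ty = object

    def __set__(self, instance, value):
        calls.append("Typed")
        if not isinstance(value, self.ty):
            raise TypeError('Expected %s' % self.ty)
        super().__set__(instance, value)


class String(Typed):
    ty = str


class Sized(Descriptor):
    def __init__(self, *args, maxlen, **kwargs):
        self.maxlen = maxlen
        super().__init__(*args, **kwargs)

    def __set__(self, instance, value):
        calls.append("Sized")
        if len(value) > self.maxlen:
            raise ValueError('Too big')
        super().__set__(instance, value)


class SizedString(String, Sized): pass


class Regex(Descriptor):
    def __init__(self, *args, pat, **kwargs):
        self.pat = re.compile(pat)
        super().__init__(*args, **kwargs)

    def __set__(self, instance, value):
        calls.append("Regex")
        if not self.pat.match(value):
            raise ValueError('Invalid string')
        super().__set__(instance, value)


class SizedRegexString(SizedString, Regex): pass


class Holder:
    # Hypothetical owner class; the name is given by hand here because
    # there is no StructMeta to fill it in.
    name = SizedRegexString("name", maxlen=8, pat='[A-Z]+$')


h = Holder()
h.name = "FOO"  # assignment on an instance triggers the __set__ chain
print(calls)    # ['Typed', 'Sized', 'Regex', 'Descriptor']
```

    Typed appears first only because it is the first class in the MRO that defines __set__. The order of the remaining classes is the same as in the __init__ chain.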

    TL;DR

    After following the call chains for both __init__ and __set__, we can see that they both follow the MRO. The disparity comes from the fact that more descendants of Descriptor explicitly define the __set__ method than the __init__ method.

    Additional Points

    Here are a couple of other points that may be causing some confusion:

    1. None of the __set__ methods defined are actually called in your example code’s current state. We can figure out why with the following two lines from your example code:

      class Stock(Structure):
          name = SizedRegexString(maxlen=8, pat='[A-Z]+$')
      

      The end product of these two lines (Stock) is produced by the StructMeta metaclass’s __new__ method. While Stock does have the name class attribute that is a SizedRegexString instance, no Stock instance is ever created, so nothing is ever assigned through that descriptor. Therefore, none of the __set__ methods are called. Where we do expect __set__ to be called is in Stock.__init__, because of the following lines in Structure.__init__:

      for name, val in bound.arguments.items():
          setattr(self, name, val)
      

      By adding s = Stock(name="FOO") to the end of your example code, we can see the __set__ methods executing successfully. Additionally, we can verify that the proper errors are raised by Regex.__set__ and Sized.__set__ with s = Stock(name="foo") and s = Stock(name="FOOFOOFOO"), respectively.

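    2. Since Python 3.7, the language guarantees that dicts preserve insertion order (in CPython 3.6 this was an implementation detail), and since Python 3.6 the class body namespace itself preserves definition order (PEP 520). So the __prepare__ method in StructMeta may be superfluous, depending on which Python version you’re using.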
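
    Both points can be checked with a condensed sketch of the question's code, assuming Python 3.7 or later. StructMeta is written without __prepare__ here. The exchange field and the try_stock helper are hypothetical additions: the second field exists only to show that definition order is preserved, and the helper only collects results.

```python
import re
from inspect import Parameter, Signature


class Descriptor:
    def __init__(self, name=None):
        self.name = name

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value


class Typed(Descriptor):
    ty = object

    def __set__(self, instance, value):
        if not isinstance(value, self.ty):
            raise TypeError('Expected %s' % self.ty)
        super().__set__(instance, value)


class String(Typed):
    ty = str


class Sized(Descriptor):
    def __init__(self, *args, maxlen, **kwargs):
        self.maxlen = maxlen
        super().__init__(*args, **kwargs)

    def __set__(self, instance, value):
        if len(value) > self.maxlen:
            raise ValueError('Too big')
        super().__set__(instance, value)


class SizedString(String, Sized): pass


class Regex(Descriptor):
    def __init__(self, *args, pat, **kwargs):
        self.pat = re.compile(pat)
        super().__init__(*args, **kwargs)

    def __set__(self, instance, value):
        if not self.pat.match(value):
            raise ValueError('Invalid string')
        super().__set__(instance, value)


class SizedRegexString(SizedString, Regex): pass


class StructMeta(type):
    # No __prepare__: this relies on the class namespace preserving order.
    def __new__(cls, clsname, bases, clsdict):
        fields = [key for key, val in clsdict.items()
                  if isinstance(val, Descriptor)]
        for name in fields:
            clsdict[name].name = name
        clsobj = super().__new__(cls, clsname, bases, dict(clsdict))
        clsobj.__signature__ = Signature(
            [Parameter(name, Parameter.POSITIONAL_OR_KEYWORD)
             for name in fields])
        return clsobj


class Structure(metaclass=StructMeta):
    def __init__(self, *args, **kwargs):
        bound = self.__signature__.bind(*args, **kwargs)
        for name, val in bound.arguments.items():
            setattr(self, name, val)  # this is what triggers __set__


class Stock(Structure):
    name = SizedRegexString(maxlen=8, pat='[A-Z]+$')
    exchange = SizedRegexString(maxlen=4, pat='[A-Z]+$')  # hypothetical field


def try_stock(*args):
    """Return 'ok' if Stock(*args) succeeds, else the error message."""
    try:
        Stock(*args)
        return "ok"
    except (TypeError, ValueError) as exc:
        return str(exc)


results = [try_stock("FOO", "NYSE"),
           try_stock("foo", "NYSE"),
           try_stock("FOOFOOFOO", "NYSE")]
print(results)  # ['ok', 'Invalid string', 'Too big']
print(list(Stock.__signature__.parameters))  # ['name', 'exchange']
```

    The last case reports 'Too big' rather than 'Invalid string' because Sized comes before Regex in the MRO, so the length check runs first.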

    Hopefully I addressed your question. If I missed the point completely, I’d be happy to try again if you could clarify exactly what you were expecting.