
Strange super() behavior in diamond inheritance in Python


As the following example shows, super() has some strange (at least to me) behavior when used in diamond inheritance.

class Vehicle:
    def start(self):
        print("engine has been started")

class LandVehicle(Vehicle):
    def start(self):
        super().start()
        print("tires are safe to use")

class WaterCraft(Vehicle):
    def start(self):
        super().start()
        print("anchor has been pulled up")

class Amphibian(LandVehicle, WaterCraft):
    def start(self):
        # we do not want to call WaterCraft.start, amphibious
        # vehicles don't have anchors
        LandVehicle.start(self)
        print("amphibian is ready for travelling on land")

amphibian = Amphibian()
amphibian.start()

The above code produces the following output:

engine has been started
anchor has been pulled up
tires are safe to use
amphibian is ready for travelling on land

When I call super().some_method(), I would never expect a method of a class on the same inheritance level to be called. So in my example, I would not expect "anchor has been pulled up" to appear in the output.

The class calling super() may not even know about the other class whose method is eventually called. In my example, LandVehicle may not even know about WaterCraft.

Is this behavior normal/expected and if so, what is the rationale behind it?


Solution

  • When you use inheritance in Python, each class defines a Method Resolution Order (MRO) that is used to decide where to look when a class attribute is looked up. For your Amphibian class, for instance, the MRO is Amphibian, LandVehicle, WaterCraft, Vehicle and finally object. (You can see this for yourself by calling Amphibian.mro().)

    The exact details of how the MRO is derived are a little complicated (Python uses the C3 linearization algorithm, if you're interested in how it works). The important thing to know is that any child class is always listed before its parent classes, and if multiple inheritance is going on, all the parents of a child class will be in the same relative order they are in the class statement (other classes may appear in between the parents, but they'll never be reversed relative to one another).
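
    As a quick check of those two ordering rules, here is a minimal sketch using empty stand-ins for the question's classes. The helper name precedes_bases is made up for this illustration and is not part of Python.

```python
class Vehicle: pass
class LandVehicle(Vehicle): pass
class WaterCraft(Vehicle): pass
class Amphibian(LandVehicle, WaterCraft): pass

mro_names = [cls.__name__ for cls in Amphibian.mro()]
print(mro_names)
# ['Amphibian', 'LandVehicle', 'WaterCraft', 'Vehicle', 'object']

def precedes_bases(cls):
    """Hypothetical helper: True if every class in cls's MRO comes before its own bases."""
    mro = cls.mro()
    return all(mro.index(k) < mro.index(base) for k in mro for base in k.__bases__)

child_before_parents = precedes_bases(Amphibian)
print(child_before_parents)  # True

# The bases keep the left-to-right order from the class statement.
bases_in_order = mro_names.index("LandVehicle") < mro_names.index("WaterCraft")
print(bases_in_order)  # True
```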

    When you use super to call an overridden method, it looks through the MRO of the instance's class like it does for any attribute lookup, but it starts its search further along than usual. Specifically, it starts to search for an attribute just after the "current" class. By "current" I mean the class containing the method in which the super call appears (even if the object the method is being called on is of some other, more derived class). So when LandVehicle.start calls super().start() on an Amphibian instance, it begins checking for a start method in the first class after LandVehicle in Amphibian's MRO, and finds WaterCraft.start.
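
    One way to see this for yourself is the explicit two-argument form, super(LandVehicle, obj), which is what a bare super() inside one of LandVehicle's methods amounts to. The sketch below reuses the question's classes (with an empty Amphibian body for brevity) and only inspects which function would be found, without calling it.

```python
class Vehicle:
    def start(self):
        print("engine has been started")

class LandVehicle(Vehicle):
    def start(self):
        super().start()
        print("tires are safe to use")

class WaterCraft(Vehicle):
    def start(self):
        super().start()
        print("anchor has been pulled up")

class Amphibian(LandVehicle, WaterCraft):
    pass

# Same "current" class (LandVehicle), two different instance types.
on_amphibian = super(LandVehicle, Amphibian()).start
on_land_vehicle = super(LandVehicle, LandVehicle()).start

# On an Amphibian, the class after LandVehicle in the MRO is WaterCraft.
print(on_amphibian.__func__ is WaterCraft.start)   # True
# On a plain LandVehicle, the next class is Vehicle.
print(on_land_vehicle.__func__ is Vehicle.start)   # True
```

    So LandVehicle never has to know about WaterCraft. The instance's class decides what comes next.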

    This suggests one way you could fix the issue. You could have Amphibian name WaterCraft as its first base class, and LandVehicle second:

    class Amphibian(WaterCraft, LandVehicle):
        ...
    

    Changing the order of the bases will also change their order in the MRO. When Amphibian.start calls LandVehicle.start directly by name (rather than using super), the subsequent super calls will skip over WaterCraft since the class they're being called from is already further along in the MRO. Thus the rest of the super calls will work as you intended.
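
    Here is a runnable sketch of the reordered version, using the question's classes with only the base order of Amphibian swapped. The output is captured with contextlib.redirect_stdout so it can be inspected as a list. The variable names are just for this illustration.

```python
import contextlib
import io

class Vehicle:
    def start(self):
        print("engine has been started")

class LandVehicle(Vehicle):
    def start(self):
        super().start()
        print("tires are safe to use")

class WaterCraft(Vehicle):
    def start(self):
        super().start()
        print("anchor has been pulled up")

class Amphibian(WaterCraft, LandVehicle):  # bases swapped
    def start(self):
        # Direct call by name: the lookup continues after LandVehicle,
        # which now sits behind WaterCraft in the MRO.
        LandVehicle.start(self)
        print("amphibian is ready for travelling on land")

reordered_mro = [cls.__name__ for cls in Amphibian.mro()]
print(reordered_mro)
# ['Amphibian', 'WaterCraft', 'LandVehicle', 'Vehicle', 'object']

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    Amphibian().start()
reordered_output = buffer.getvalue().splitlines()
print(reordered_output)
# ['engine has been started', 'tires are safe to use',
#  'amphibian is ready for travelling on land']
```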

    But that's not really a great solution. When you explicitly name a base class like that, you may find that it breaks things later if you have more child classes that want to do things differently. For instance, a class derived from the reordered-base Amphibian above might end up with other base classes in between WaterCraft and LandVehicle, which would also have their start methods skipped accidentally when Amphibian.start calls LandVehicle.start directly.
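
    To make that failure mode concrete, here is a sketch with two hypothetical classes that are not in the question, OffRoadVehicle and OffRoadAmphibian. OffRoadVehicle lands between WaterCraft and LandVehicle in the subclass's MRO, and its start method is silently skipped.

```python
import contextlib
import io

class Vehicle:
    def start(self):
        print("engine has been started")

class LandVehicle(Vehicle):
    def start(self):
        super().start()
        print("tires are safe to use")

class WaterCraft(Vehicle):
    def start(self):
        super().start()
        print("anchor has been pulled up")

class Amphibian(WaterCraft, LandVehicle):  # the reordered-base version
    def start(self):
        LandVehicle.start(self)  # explicit call by name
        print("amphibian is ready for travelling on land")

class OffRoadVehicle(LandVehicle):  # hypothetical, for illustration only
    def start(self):
        super().start()
        print("four-wheel drive engaged")

class OffRoadAmphibian(Amphibian, OffRoadVehicle):  # hypothetical
    pass

offroad_mro = [cls.__name__ for cls in OffRoadAmphibian.mro()]
print(offroad_mro)
# ['OffRoadAmphibian', 'Amphibian', 'WaterCraft', 'OffRoadVehicle',
#  'LandVehicle', 'Vehicle', 'object']

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    OffRoadAmphibian().start()
offroad_output = buffer.getvalue().splitlines()

# OffRoadVehicle.start never ran, although nobody asked for it to be skipped.
print("four-wheel drive engaged" in offroad_output)  # False
```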

    A better solution would be to allow all the start methods to get called in turn, but to factor out the parts of them you might not always want to run into other methods that can be separately overridden.

    For example, you could change WaterCraft to:

    class WaterCraft(Vehicle):
        def start(self):
            super().start()
            self.weigh_anchor()
    
        def weigh_anchor(self):
            print("anchor has been pulled up")
    

    The Amphibian class could override the anchor specific behavior (e.g. to do nothing):

    class Amphibian(LandVehicle, WaterCraft):
        def start(self):
            super().start()
            print("amphibian is ready for travelling on land")
    
        def weigh_anchor(self):
            pass # no anchor to weigh, so do nothing
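
    For completeness, here is a sketch that combines these two classes with the question's unchanged Vehicle and LandVehicle so the whole thing can be run. Note that super().start() takes no explicit self argument. The output capture is only there to show the result as a list.

```python
import contextlib
import io

class Vehicle:
    def start(self):
        print("engine has been started")

class LandVehicle(Vehicle):
    def start(self):
        super().start()
        print("tires are safe to use")

class WaterCraft(Vehicle):
    def start(self):
        super().start()
        self.weigh_anchor()

    def weigh_anchor(self):
        print("anchor has been pulled up")

class Amphibian(LandVehicle, WaterCraft):
    def start(self):
        super().start()  # every start method in the MRO runs in turn
        print("amphibian is ready for travelling on land")

    def weigh_anchor(self):
        pass  # no anchor to weigh, so do nothing

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    Amphibian().start()
cooperative_output = buffer.getvalue().splitlines()
print(cooperative_output)
# ['engine has been started', 'tires are safe to use',
#  'amphibian is ready for travelling on land']
```

    WaterCraft.start still runs here. Only the anchor step is overridden, so a plain WaterCraft instance keeps printing the anchor message.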
    

    Of course in this specific case, where WaterCraft doesn't do anything other than raise its anchor, it would be even simpler to remove WaterCraft as a base class for Amphibian. But the same idea can often work for non-trivial code.