python types immutability subclassing binary-operators

How can I make my Python `set` and `frozenset` subclasses preserve their types when engaging in binary operations?


I have some set and frozenset subclasses, OCDSet and OCDFrozenSet respectively. When I use them together with instances of their ancestor classes in binary operations, the ancestor classes dominate the type of the result. By which I mean: when I subtract an OCDFrozenSet from a frozenset, I get a frozenset, but the same is true if I reverse the types in the operation (i.e. subtract a frozenset from an OCDFrozenSet).


What I find especially vexing and counterintuitive is that using -= (subtract-in-place) mutates the type of the existing instance!
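A minimal sketch of the behaviour described, assuming bare subclasses named as in the question that override nothing (the class bodies and sample values here are illustrative assumptions), run under Python 3:

```python
# Hypothetical bare subclasses, assumed to override no methods.
class OCDSet(set):
    pass


class OCDFrozenSet(frozenset):
    pass


plain = frozenset({1, 2, 3})
ocd = OCDFrozenSet({2, 3, 4})

# Either operand order yields the ancestor type.
print(type(plain - ocd).__name__)   # frozenset
print(type(ocd - plain).__name__)   # frozenset

# -= on the frozenset subclass rebinds the name to a new, plain frozenset.
before = ocd
ocd -= plain
print(type(ocd).__name__)      # frozenset
print(before is ocd)           # False
print(type(before).__name__)   # OCDFrozenSet (the original object is untouched)

# The mutable subclass is genuinely updated in place, so its type survives.
mutable = OCDSet({1, 2, 3})
mutable -= {1}
print(type(mutable).__name__)  # OCDSet
```

Note that the existing OCDFrozenSet object is never altered; the name is simply rebound to the frozenset returned by the subtraction.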

My knowledge of how to deal with this sort of thing comes strictly from C++, where the type of the result is a foregone conclusion, explicitly specified in a (likely templated) operator-overload function. In Python the type system is often much more implicit, but it isn’t as mutably unpredictable as that in-place operation would now have me believe.

So, what is the most expedient way to address this? I assume it involves overriding some double-underscored instance methods in the subclasses of interest.


Solution

  • In-place operations don't guarantee that they will update the object in place; that depends entirely on the type of the object.

    Tuple, frozenset, etc. are immutable types, hence it is not possible to update them in place.

    From the library reference on in-place operators:

    For immutable targets such as strings, numbers, and tuples, the updated value is computed, but not assigned back to the input variable.

    Similarly, the frozenset docs mention the same thing about in-place operations [source]:

    The following table lists operations available for set that do not apply to immutable instances of frozenset.


    Now, as your OCDFrozenSet doesn't implement __isub__, it will fall back to the __sub__ method, which returns an instance of the base class frozenset. The base class is used because Python has no idea what arguments your subclass would expect when constructing the new set produced by the __sub__ operation.

    More importantly, returning the subclass instance from such operations was a bug in Python 2; the fix was only applied to Python 3, though, to avoid breaking existing systems.


    To get the expected output you can provide the required methods in your subclass:

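    class OCDFrozenSet(frozenset):
        def __sub__(self, other):
            # Re-wrap the base-class result in the subclass type.
            result = super().__sub__(other)
            return result if result is NotImplemented else type(self)(result)

        def __rsub__(self, other):
            # Reflected case (other - self); tried first when other is a plain frozenset.
            result = super().__rsub__(other)
            return result if result is NotImplemented else type(self)(result)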