python-3.x dictionary counter

Deepcopy for instances of class inheriting from Counter vs dict


I am writing custom classes that inherit either from the Python class dict or from collections.Counter, and I am running into a problem with the behaviour of deepcopy. Basically, deepcopy works as intended when inheriting from dict but not when inheriting from Counter.

Here is an example:

from copy import deepcopy
from collections import Counter

class MyCounter(Counter):
    def __init__(self, foo):
        self.foo = foo

class MyDict(dict):
    def __init__(self, foo):
        self.foo = foo


c = MyCounter(0)
assert c.foo == 0  # Success
c1 = deepcopy(c)
assert c1.foo == 0  # Failure


d = MyDict(0)
assert d.foo == 0  # Success
d1 = deepcopy(d)
assert d1.foo == 0  # Success

I am a bit clueless as to why this is happening given that the source code of the Counter class does not seem to change anything about the deepcopy (no custom __deepcopy__ method for instance).

I understand that I may have to write a custom __deepcopy__ method, but it's not clear to me how to do that. In general I would rather not have to, given that it works perfectly for dict.

Any help will be much appreciated.


Solution

  • deepcopy has several fallbacks: it first looks for a __deepcopy__ method on the object and, if there is none, falls back on the pickle protocol (__reduce_ex__ / __reduce__).

    In this case, your particular base class Counter specializes pickle serialization, which is what deepcopy picks up on (as the second option, since no specialization for __deepcopy__ happens to exist).

    If you step through the code in a debugger, you'll find it ends up at Counter's __reduce__ method, which in the Python 3.9 implementation of Counter reads:

        def __reduce__(self):
            return self.__class__, (dict(self),)
    

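    Here we can see where the information is lost: Counter's implementation assumes that the object stores nothing besides the dictionary part itself. The copy is rebuilt by calling the class with dict(self) as its only argument, so for MyCounter the counter's contents are passed to __init__ as foo and the original foo is dropped.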
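
    To make the loss visible, here is a small sketch that inspects what __reduce__ returns for the MyCounter class from the question and what deepcopy then rebuilds. The key "a" and its count are made up for illustration, and the printed values assume a Counter.__reduce__ like the one quoted above.

```python
from collections import Counter
from copy import deepcopy


class MyCounter(Counter):
    def __init__(self, foo):
        self.foo = foo


c = MyCounter(0)
c["a"] = 2  # illustrative entry, not from the question

# __reduce__ describes the object as "call this class with these arguments".
factory, args = c.__reduce__()[:2]
print(factory is MyCounter)  # True
print(args)                  # ({'a': 2},)

# deepcopy therefore calls MyCounter({'a': 2}), which stores the dict in foo
# and leaves the counter itself empty.
c1 = deepcopy(c)
print(c1.foo)   # {'a': 2}
print(len(c1))  # 0
```

    So the failing assertion in the question is not only about a missing attribute: foo ends up holding the old counts, and the counts themselves are gone.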

    You could override __reduce__ or __reduce_ex__, which would fix pickling and, as a bonus, also fix deepcopy, or you could override __deepcopy__ and provide the necessary implementation for it.
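
    As a rough sketch of the first option, __reduce__ can return the five-element form of the reduce tuple so that foo goes through __init__ and the counts are restored item by item. This is one possible shape rather than the only correct one, and the sample values are made up.

```python
from collections import Counter
from copy import deepcopy


class MyCounter(Counter):
    def __init__(self, foo):
        self.foo = foo

    def __reduce__(self):
        # (callable, args, state, list items, dict items):
        # rebuild with MyCounter(foo), then set each key/count pair on the result.
        return (type(self), (self.foo,), None, None, iter(self.items()))


c = MyCounter({"nested": [1, 2]})  # illustrative value for foo
c["deep"] = 1
c["copy"] = 2

c1 = deepcopy(c)
assert c1.foo == c.foo
assert c1.foo["nested"] is not c.foo["nested"]  # foo was copied deeply
assert c1["deep"] == 1 and c1["copy"] == 2
```

    Because pickle consults the same method, this version should also round-trip through pickle, provided the class is importable from the module where it is defined.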

    Implementing our own __deepcopy__ isn't too complex, and we can keep the code very simple:

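    from copy import deepcopy
    from collections import Counter

    class MyCounter(Counter):
        def __init__(self, foo):
            self.foo = foo
            
        def __deepcopy__(self, memo):
            # type(self) rather than MyCounter, so subclasses copy to their own type
            copy_instance = type(self)(deepcopy(self.foo, memo))
            for key, val in self.items():
                copy_instance[deepcopy(key, memo)] = val  # val is just an int
            return copy_instance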
    
    c = MyCounter(123)
    c['deep'] = 1
    c['copy'] = 2
    
    c1 = deepcopy(c)
    assert c1.foo == c.foo
    assert c1['deep'] == c['deep']
    assert c1['copy'] == c['copy']
    

    (In most cases I would probably recommend against subclassing Counter or dict in order to add more attributes to them, and would instead compose a custom class that has a counter or dict instance variable.)
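
    A minimal sketch of that composition approach follows. The class name TaggedCounts and the attribute name counts are hypothetical, chosen only for this example.

```python
from collections import Counter
from copy import deepcopy


class TaggedCounts:
    """Hypothetical wrapper: holds a Counter instead of inheriting from it."""

    def __init__(self, foo):
        self.foo = foo
        self.counts = Counter()


t = TaggedCounts(123)
t.counts["deep"] += 1
t.counts.update(["copy", "copy"])

# No __deepcopy__ or __reduce__ is needed: the default copy of a plain object
# copies its attribute dictionary, including the Counter stored in it.
t1 = deepcopy(t)
assert t1.foo == t.foo
assert t1.counts == t.counts
assert t1.counts is not t.counts
```

    The trade-off is that the wrapper is no longer a mapping itself, so callers go through t.counts (or through methods you choose to forward) instead of indexing the object directly.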