Default dict with a non-trivial default


I want to create a "default dict" that performs a non-trivial operation on the missing key (like a DB lookup, for example). I've seen some old answers on here, like Using the key in collections.defaultdict, that recommend subclassing collections.defaultdict.

While this makes sense, is there still a reason to use defaultdict at this point? Why not simply subclass dict and override its __missing__ method instead? Does defaultdict provide something else that I'd gain by subclassing it?


Solution

    What does defaultdict add?

    According to the documentation, the only difference between a defaultdict and a built-in dict is:

    It overrides one method and adds one writable instance variable.

    The one method is __missing__, which dict's __getitem__ calls when the requested key is not present.

    And the one writable instance variable is the default_factory - a callable with no arguments used by __missing__ to determine the default value to be used with missing keys.

    Roughly equivalent to:

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = self.default_factory()
        return self[key]
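
    To make the behavior described above concrete, here is a minimal sketch using only the standard library. The variable names are illustrative; the point is that the factory receives no arguments, that only item access (d[key]) goes through __missing__, and that default_factory can be reassigned:

```python
from collections import defaultdict

counts = defaultdict(int)

# Item access on a missing key calls __missing__, which stores int() == 0;
# the += then adds 1 to that stored default.
counts["a"] += 1

# get() does not call __missing__, so nothing is created for "b".
peek = counts.get("b")

# default_factory is a writable attribute; with None, missing keys raise KeyError.
counts.default_factory = None
```

    After this runs, counts holds only the key "a", and counts["c"] would raise KeyError because the factory has been cleared.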
    

    When to inherit at all?

    To be clear, the only reason you would need to create a subclass at all is if the default value for a missing key depends on the key itself. If your default factory doesn't need the key, then no matter how complicated the logic is, you can just use defaultdict instead of inheriting from it. If the logic is too much for a lambda, you can of course still write a function and pass it in:

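    from collections import defaultdict

    def calc():
        # very long code
        # calculating a new default value that does not depend on the key
        # (maybe a DB request to fetch the latest record...)
        new_value = 0  # placeholder for the computed value
        return new_value
    
    d = defaultdict(calc)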
    

    If you actually need the key itself to calculate the default value, then you need to inherit. The remaining question is which class to inherit from.

    When to inherit from defaultdict?

    The main advantage is a dynamic factory: if you want to be able to change default_factory at runtime, inheriting from defaultdict saves you the bother of implementing that yourself (no need to override __init__...).

    Note, however, that you then have to take the existence of default_factory into account when you override __missing__, as can be seen in this answer.
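
    As a rough sketch of this option (not necessarily the code from the linked answer), a defaultdict subclass can pass the key to the factory while still honoring a default_factory of None. The names KeyedDefaultDict and fake_lookup are hypothetical, and fake_lookup merely stands in for something like a DB query:

```python
from collections import defaultdict


class KeyedDefaultDict(defaultdict):
    """defaultdict whose default_factory is called with the missing key."""

    def __missing__(self, key):
        # Keep defaultdict's contract: no factory means a plain KeyError.
        if self.default_factory is None:
            raise KeyError(key)
        # Unlike defaultdict, pass the key to the factory and store the result.
        value = self[key] = self.default_factory(key)
        return value


def fake_lookup(key):
    # Hypothetical stand-in for an expensive, key-dependent lookup.
    return f"record-for-{key}"


d = KeyedDefaultDict(fake_lookup)
first = d["alice"]              # computed by fake_lookup and stored

d.default_factory = str.upper   # the factory can be swapped at runtime
second = d["bob"]               # now computed by str.upper

d.default_factory = None        # from here on, missing keys raise KeyError
```

    The constructor and the writable default_factory attribute come from defaultdict for free; the price is the extra None check inside __missing__.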

    When to inherit from dict?

    When you don't care about dynamically changing the factory and can be satisfied with a static one throughout the existence of the dict.

    In this case you simply override the __missing__ method and implement the factory there, with whatever complicated logic you need that depends on the key.
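
    A minimal sketch of the plain dict approach follows. LookupDict and its compute method are hypothetical names, and the length calculation is only a placeholder for real key-dependent logic such as a DB lookup:

```python
class LookupDict(dict):
    """dict that computes and caches a key-dependent default."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent.
        value = self.compute(key)
        self[key] = value  # cache it so later lookups skip the computation
        return value

    @staticmethod
    def compute(key):
        # Hypothetical placeholder for the real, possibly expensive, logic.
        return len(key)


cache = LookupDict()
n = cache["hello"]           # computed via __missing__, then stored
absent = cache.get("other")  # get() bypasses __missing__ and returns None
```

    There is no default_factory to check here, so __missing__ stays short. As with defaultdict, only d[key] access triggers it; get() and the in operator do not.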