Search code examples
pythondictionarydata-structuresimmutability

What would a "frozen dict" be?


  • A frozen set is a frozenset.
  • A frozen list could be a tuple.
  • What would a frozen dict be? An immutable, hashable dict.

I guess it could be something like collections.namedtuple, but that is more like a frozen-keys dict (a half-frozen dict). Isn't it?

A "frozendict" should be a frozen dictionary, it should have keys, values, get, etc., and support in, for, etc.

update :
* there it is : https://www.python.org/dev/peps/pep-0603


Solution

  • Python doesn't have a builtin frozendict type. It turns out this wouldn't be useful too often (though it would still probably be useful more often than frozenset is).

    The most common reason to want such a type is when memoizing function calls for functions with unknown arguments. The most common solution to store a hashable equivalent of a dict (where the values are hashable) is something like tuple(sorted(kwargs.items())).

    This depends on the sorting not being a bit insane. Python cannot positively promise sorting will result in something reasonable here. (But it can't promise much else, so don't sweat it too much.)


    You could easily enough make some sort of wrapper that works much like a dict. It might look something like (In Python 3.10 and later, replace collections.Mapping with collections.abc.Mapping):

    import collections
    
    class FrozenDict(collections.Mapping):
        """Don't forget the docstrings!!"""
        
        def __init__(self, *args, **kwargs):
            self._d = dict(*args, **kwargs)
            self._hash = None
    
        def __iter__(self):
            return iter(self._d)
    
        def __len__(self):
            return len(self._d)
    
        def __getitem__(self, key):
            return self._d[key]
    
        def __hash__(self):
            # It would have been simpler and maybe more obvious to 
            # use hash(tuple(sorted(self._d.iteritems()))) from this discussion
            # so far, but this solution is O(n). I don't know what kind of 
            # n we are going to run into, but sometimes it's hard to resist the 
            # urge to optimize when it will gain improved algorithmic performance.
            if self._hash is None:
                hash_ = 0
                for pair in self.items():
                    hash_ ^= hash(pair)
                self._hash = hash_
            return self._hash
    

    It should work great:

    >>> x = FrozenDict(a=1, b=2)
    >>> y = FrozenDict(a=1, b=2)
    >>> x is y
    False
    >>> x == y
    True
    >>> x == {'a': 1, 'b': 2}
    True
    >>> d = {x: 'foo'}
    >>> d[y]
    'foo'