python, dictionary, python-internals

Why don't dictionaries resize after deletions?


Apparently deleting entries in a dictionary doesn't trigger any resizes.

This can be seen from the following:

# Drastic example; nobody does such
# things with dicts, AFAIK
from sys import getsizeof

d = {i:i for i in range(100)}
print(getsizeof(d))  # 4704
for i in range(100):
    del d[i]  # similarly with pop
print(getsizeof(d))  # 4704

The same can be seen from a question on SO (from what I've found). Sets behave in a similar fashion, which is to be expected, since they fall in line with what dicts do.

Lists, on the other hand, shrink when the new size falls below half of the size already allocated; this is stated in a comment in list_resize:

/* Bypass realloc() when a previous overallocation is large enough
   to accommodate the newsize.  If the newsize falls lower than half
   the allocated size, then proceed with the realloc() to shrink the list.
*/
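
As a rough illustration of that comment, the sketch below measures a list before and after most of its items are popped. It assumes CPython, where sys.getsizeof reflects the allocated buffer; exact byte counts vary by version and platform, so none are shown.

```python
from sys import getsizeof

lst = list(range(100))
full = getsizeof(lst)

# Pop 90 of the 100 items, so the length falls well below
# half of the buffer that was originally allocated.
for _ in range(90):
    lst.pop()
shrunk = getsizeof(lst)

# On CPython the second number is expected to be smaller than the first.
print(full, shrunk)
```

Unlike the dict in the example above, the list gives memory back while it is being emptied.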

Why don't dictionaries (and, indirectly, sets) employ a similar trick, instead of waiting for a new entry to be inserted? The behaviour described applies to Python 2.7 and 3.x.


Solution

  • This is somewhat explained in Objects/dictnotes.txt, a companion file containing various notes on the dict implementation:

    Dictionary operations involving only a single key can be O(1) unless resizing is possible. By checking for a resize only when the dictionary can grow (and may require resizing), other operations remain O(1), and the odds of resize thrashing or memory fragmentation are reduced. In particular, an algorithm that empties a dictionary by repeatedly invoking .pop will see no resizing, which might not be necessary at all because the dictionary is eventually discarded entirely.

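    One important consideration is that shrinking a list's buffer is really easy (a single realloc() call, as in list_resize), while shrinking a dict's internal hash table is a much more complex operation: a new, smaller table has to be allocated and every remaining entry re-inserted into it.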
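
If the memory held by a mostly emptied dict does matter, one possible workaround is to rebuild it so that a fresh table sized for the surviving entries is allocated. The following is a minimal sketch, assuming CPython; the variable names are only illustrative and the exact sizes depend on the version and platform.

```python
from sys import getsizeof

d = {i: i for i in range(100)}
full = getsizeof(d)

# Delete most entries; the hash table keeps its original size.
for i in range(95):
    del d[i]
after_delete = getsizeof(d)

# Copying into a new dict allocates a table sized for the 5 remaining items.
compact = dict(d)
rebuilt = getsizeof(compact)

print(full, after_delete, rebuilt)
```

The copy costs O(n) in the number of remaining entries, which is the kind of work the deletion path avoids by never resizing.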