Search code examples
pythondictionaryhashmapsize

How allocation of memory for `dict` in Python works?


I was playing around with Dictionaries and found this.

import sys

Square1 = {}
Square2 = {}
Square3 = {}

for i in range(1, 8):
    Square1[i] = i**2

for i in range(1, 11):
    Square2[i] = i**2

for i in range(1, 12):
    Square3[i] = i**2


print(sys.getsizeof(Square1), len(Square1))
print(sys.getsizeof(Square2), len(Square2))
print(sys.getsizeof(Square3), len(Square3))

Output:

196 7
196 10
344 11

The size of Dictionary length 7 and 10 is the same that is 196 but for length 11, it's 344. Why are they same? Why does the size rise up with length 11? How does dictionary size work in Python?


Solution

  • When you create an empty dictionary, it preallocates the memory in chunks for initial few references it can store. As the dictionary adds more key-value pairs, it needs more memory.

    But it doesn’t grow with each addition; each time it needs more space, it adds some chunk of memory which can accommodate "X" amount of key-value pairs, and once "X" amount is filled, another chunk of memory is allocated to dictionary.

    Here's a sample code to display change is dictionary's size as the count of keys increases:

    import sys
    
    my_dict = {}
    print("Size with {} keys:\t {}".format(0, sys.getsizeof(my_dict)))
    
    for i in range(21):
        my_dict[i] = ''
        print("Size with {} keys:\t {}".format(i+1, sys.getsizeof(my_dict)))
    

    Here's the output in Python 3.6.2:

    #same size for key count 0 - 5 : 240 Bytes
    Size with 0 keys:    240
    Size with 1 keys:    240
    Size with 2 keys:    240
    Size with 3 keys:    240
    Size with 4 keys:    240
    Size with 5 keys:    240
    
    #same size for key count 6 - 10 : 360 Bytes
    Size with 6 keys:    368
    Size with 7 keys:    368
    Size with 8 keys:    368
    Size with 9 keys:    368
    Size with 10 keys:   368
    
    #same size for key count 11 - 20 : 648 Bytes
    Size with 11 keys:   648
    Size with 12 keys:   648
    Size with 13 keys:   648
    Size with 14 keys:   648
    Size with 15 keys:   648
    Size with 16 keys:   648
    Size with 17 keys:   648
    Size with 18 keys:   648
    Size with 19 keys:   648
    Size with 20 keys:   648
    

    Also, dictionary just stores a reference of memory that holds the keys and values, and doesn't stores key-value itself as part of dict object. So neither the type nor the size of the data affect the result of sys.getsizeof() for the dictionary.

    For example, size of both the below dicts is 280 Bytes

    >>> sys.getsizeof({'a': 'a'})
    280
    
    >>> sys.getsizeof({'a'*100000: 'a'*1000000})
    280
    

    However here's the difference between size of 'a' V/s 'a' * 1000000:

    >>> sys.getsizeof('a')
    38
    
    >>> sys.getsizeof('a'*1000000)
    1000037