I am trying to get the actual memory size of a dict. I got results I don't understand and would appreciate an explanation.
import sys

a = {}
for i in range(2):
    a[i] = {}
    for j in range(1000):
        a[i][j] = j

sys.getsizeof(a), sys.getsizeof(a[0]), sys.getsizeof(a[1])
The result is (272, 49424, 49424) bytes. I expected the size of a to be at least the sum of the sizes of a[0] and a[1].
But if I try the following instead:
a = {}
for i in range(2000):
    a[i] = [i, i, i]

sys.getsizeof(a)
the reported size of a is 196880 bytes. So the flat dict with 2000 keys is reported as 196880 bytes, while the dict with only 2 keys (each holding a dict of 1000 keys) is reported as just 272 bytes. Why?
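sys.getsizeof reports only the shallow size of the object you pass it: for a dict, that is the hash table of pointers to keys and values, not the objects those pointers refer to. That is why your outer dict is tiny even though the two inner dicts it points to are large. A quick check makes this visible (exact numbers vary by Python version, so treat them as illustrative):

import sys

big = list(range(1000))
print(sys.getsizeof({'k': big}))   # a one-entry dict
print(sys.getsizeof({'k': None}))  # same size: the value object is not counted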
You need to determine the size of the dict, plus the sizes of all its keys and values, recursively (I wish Python had a built-in function for this). I have used variations of this recipe a number of times:
import sys

def get_size(obj, seen=None):
    """Recursively find the size of an object and everything it references."""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important: mark as seen *before* recursing, to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    if isinstance(obj, dict):
        size += sum(get_size(v, seen) for v in obj.values())
        size += sum(get_size(k, seen) for k in obj.keys())
    elif hasattr(obj, '__dict__'):
        size += get_size(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum(get_size(i, seen) for i in obj)
    return size
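For example, applied to the nested dict from your question, it counts the outer dict plus everything it references (exact numbers vary by Python version and platform, so treat the comments as illustrative):

a = {}
for i in range(2):
    a[i] = {j: j for j in range(1000)}

print(sys.getsizeof(a))  # shallow size: just the outer hash table (~272)
print(get_size(a))       # also counts both inner dicts, their keys and values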
I have occasionally had to make versions of this that work for other custom types, NumPy arrays, and the like (see the sketch below). Sadly there's no perfect generic solution.
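As one illustration of such a variant, here is a sketch of a NumPy-aware version. This is an assumption-laden example, not a drop-in solution: it assumes NumPy is installed, it counts each array's data buffer via nbytes (which can double-count views that share a buffer, and may overlap with what sys.getsizeof already reports for arrays that own their data):

import sys
import numpy as np

def get_size_np(obj, seen=None):
    """get_size with an extra branch for NumPy arrays (illustrative sketch)."""
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    seen.add(obj_id)
    size = sys.getsizeof(obj)
    if isinstance(obj, np.ndarray):
        # Count the data buffer explicitly. Caveat: views share their base
        # array's buffer, so this can double-count shared data.
        size += obj.nbytes
    elif isinstance(obj, dict):
        size += sum(get_size_np(k, seen) + get_size_np(v, seen)
                    for k, v in obj.items())
    elif hasattr(obj, '__dict__'):
        size += get_size_np(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum(get_size_np(i, seen) for i in obj)
    return size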