Python Tuple vs List vs Array memory consumption


I've been reading Fluent Python by Luciano Ramalho, and in the chapter 'Overview of Built-in Sequences', when describing the C struct behind float, he states:

".. That's why an array of floats is much more compact than a tuple of floats: the array is a single object holding the raw values of a float, while the tuple consists of several objects - the tuple itself and each float object contained in it".

So I decided to confirm this and wrote a simple script.

import sys
import array

array_obj = array.array("d", [])
list_obj = []
tuple_obj = tuple()

# Size change (in bytes) reported by sys.getsizeof after each new element
arr_delta = []
tuple_delta = []
list_delta = []

for i in range(16):
    s1 = sys.getsizeof(array_obj)
    array_obj.append(float(i))
    s2 = sys.getsizeof(array_obj)
    arr_delta.append(s2 - s1)

    # Tuples are immutable, so build a new one with one more float each time
    s1 = sys.getsizeof(tuple_obj)
    tuple_obj = tuple(float(j) for j in range(i + 1))
    s2 = sys.getsizeof(tuple_obj)
    tuple_delta.append(s2 - s1)

    # Store floats (not ints) so all three containers hold the same values
    s1 = sys.getsizeof(list_obj)
    list_obj.append(float(i))
    s2 = sys.getsizeof(list_obj)
    list_delta.append(s2 - s1)

print("Float size: ", sys.getsizeof(1.0))
print("Array size: ", sys.getsizeof(array_obj))
print("Tuple size: ", sys.getsizeof(tuple_obj))
print("List size: ", sys.getsizeof(list_obj))

print("Array allocations: ", arr_delta)
print("Tuple allocations: ", tuple_delta)
print("List allocations: ", list_delta)

This produces the following output on my system (Python 3.11.4, 64-bit):

Float size:  24
Array size:  208
Tuple size:  168
List size:  184
Array allocations:  [32, 0, 0, 0, 32, 0, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0]
Tuple allocations:  [8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8]
List allocations:  [32, 0, 0, 0, 32, 0, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0]

The allocation lists show how the size of each object changed after each new float was added (the tuple is recreated each time).

So, my questions are:
Aren't lists and tuples supposed to hold pointers to PyFloatObjects, while arrays hold the raw float values?
Or is this a peculiarity of sys.getsizeof?
As can be seen in this SO question, the C structs behind tuple and list contain arrays of pointers to PyFloatObject.

Thanks!


Solution

  • As @jonrsharpe pointed out in a comment on my question, according to the Python docs:

    Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.

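    So, for a list, sys.getsizeof returns the size of the list object itself plus 8 bytes (on a 64-bit build) for every element, because the list stores pointers to its elements rather than the elements themselves. The float objects those pointers refer to are not counted.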

    The same Python doc contains a link to a recursive sizeof recipe. After replacing sys.getsizeof with the suggested recursive sizeof, I got the following result (each size line shows the sys.getsizeof value followed by the recursive total):

    Array size:  208 208
    Tuple size:  168 552
    Array allocations:  [32, 0, 0, 0, 32, 0, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0]
    Tuple allocations:  [32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32]
    

    The array results stay the same, but the tuple grows by 32 bytes with each new element: 24 bytes (PyFloatObject) + 8 bytes (64-bit pointer), which is expected.
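
For anyone who wants to reproduce the comparison without pulling in the full recipe, here is a minimal sketch. It assumes CPython, and `deep_sizeof` is a hypothetical, simplified helper written for this illustration (it only follows tuples and lists); it is not the recipe linked from the docs. Exact byte counts depend on the Python version and build, so the script prints them instead of hard-coding them.

```python
import struct
import sys
from array import array


def deep_sizeof(obj, seen=None):
    """Size of obj plus the objects reachable through tuples and lists.

    Simplified stand-in for a recursive sizeof; other containers
    (dict, set, custom classes) are deliberately not followed.
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:  # count an object shared by several slots only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, (tuple, list)):
        size += sum(deep_sizeof(item, seen) for item in obj)
    return size


values = [float(i) for i in range(16)]
array_obj = array("d", values)  # raw 8-byte C doubles stored inline
tuple_obj = tuple(values)       # pointers to separate float objects

pointer_size = struct.calcsize("P")  # size of a C pointer on this build
per_float_in_tuple = pointer_size + sys.getsizeof(1.0)

print("Array (shallow, deep):", sys.getsizeof(array_obj), deep_sizeof(array_obj))
print("Tuple (shallow, deep):", sys.getsizeof(tuple_obj), deep_sizeof(tuple_obj))
print("Bytes per float in array:", array_obj.itemsize)
print("Bytes per float in tuple:", per_float_in_tuple)
```

The array's shallow and deep sizes match because it refers to no other Python objects, while the tuple's deep size adds one float object per slot on top of the pointers that sys.getsizeof already counts.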