
What resources does an instance of a class use?


How efficient is python (cpython I guess) when allocating resources for a newly created instance of a class? I have a situation where I will need to instantiate a node class millions of times to make a tree structure. Each of the node objects should be lightweight, just containing a few numbers and references to parent and child nodes.

For example, will python need to allocate memory for all the "double underscore" properties of each instantiated object (e.g. the docstrings, __dict__, __repr__, __class__, etc, etc), either to create these properties individually or store pointers to where they are defined by the class? Or is it efficient and does not need to store anything except the custom stuff I defined that needs to be stored in each object?


Solution

  • Superficially it's quite simple: Methods, class variables, and the class docstring are stored in the class (function docstrings are stored in the function). Instance variables are stored in the instance. The instance also references the class so you can look up the methods. Typically all of them are stored in dictionaries (the __dict__).

    So yes, the short answer is: Python doesn't store methods in the instances, but all instances need to have a reference to the class.

    For example if you have a simple class like this:

    class MyClass:
        def __init__(self):
            self.a = 1
            self.b = 2
    
        def __repr__(self):
            return f"{self.__class__.__name__}({self.a}, {self.b})"
    
    instance_1 = MyClass()
    instance_2 = MyClass()
    

    Then in-memory it looks (very simplified) like this:

    [diagram: `instance_1` and `instance_2` each hold their own `__dict__` containing `a` and `b`; both reference `MyClass`, whose `__dict__` holds `__init__` and `__repr__`]
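    This layout can be checked directly in code; a quick sketch, reusing the `MyClass` definition above:

```python
class MyClass:
    def __init__(self):
        self.a = 1
        self.b = 2

    def __repr__(self):
        return f"{self.__class__.__name__}({self.a}, {self.b})"

instance_1 = MyClass()
instance_2 = MyClass()

# Instance dictionaries contain only the instance variables ...
print(instance_1.__dict__)                # {'a': 1, 'b': 2}

# ... while the methods live in the class dictionary.
print('__init__' in MyClass.__dict__)     # True
print('__init__' in instance_1.__dict__)  # False

# Both instances reference the same class object.
print(type(instance_1) is type(instance_2) is MyClass)  # True
```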

    Going deeper

    However, there are a few things that are important when going deeper into CPython:

    • Having a dictionary as the abstraction leads to quite a bit of overhead: you need a reference to the instance dictionary (8 bytes), and each entry in the dictionary stores the hash (8 bytes), a pointer to the key (8 bytes), and a pointer to the stored attribute (another 8 bytes). Dictionaries also generally over-allocate so that adding another attribute doesn't trigger a resize.
    • Python doesn't have "value types": even an integer is a full instance. That means you can't store an integer in just 4 bytes - Python needs (on my computer) 24 bytes to store the integer 0 and at least 28 bytes to store integers other than zero. However, references to other objects only require 8 bytes (a pointer).
    • CPython uses reference counting, so each instance needs a reference count (8 bytes). Also, most of CPython's classes participate in the cyclic garbage collector, which incurs an overhead of another 24 bytes per instance. In addition, classes whose instances can be weak-referenced (most of them) also have a __weakref__ field (another 8 bytes).
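    Some of these per-object costs can be observed with `sys.getsizeof`, which reports the size of the object itself (not the objects it references). The exact numbers are implementation details and vary between CPython versions and platforms:

```python
import sys

# Small ints are full objects: far more than 4 bytes each.
print(sys.getsizeof(0))       # roughly 24-28 bytes depending on CPython version
print(sys.getsizeof(1))       # typically 28 bytes on a 64-bit build

# Larger integers need additional internal "digits", so they grow further.
print(sys.getsizeof(2**100))  # noticeably larger than a small int

# A plain instance also drags along its attribute dictionary,
# which getsizeof does not include.
class Point:
    def __init__(self):
        self.x = 1
        self.y = 2

p = Point()
print(sys.getsizeof(p), sys.getsizeof(p.__dict__))
```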

    At this point it's also necessary to point out that CPython mitigates a few of these "problems":

    • Python uses Key-Sharing Dictionaries to avoid some of the memory overheads (hash and key) of instance dictionaries.
    • You can use __slots__ in classes to avoid __dict__ and __weakref__. This can give a significantly smaller memory footprint per instance.
    • Python interns some values, for example if you create a small integer it will not create a new integer instance but return a reference to an already existing instance.
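    Small-integer interning is easy to observe. Note this is a CPython implementation detail (integers from -5 through 256 are cached), not a language guarantee:

```python
# CPython caches small integers (-5 through 256), so creating a "new" one
# actually returns a reference to a pre-existing object.
a = 256
b = int("256")   # constructed at runtime, not a shared literal
print(a is b)    # True: same cached object

# Outside the cached range, distinct objects are typically created.
c = 10**10
d = int("10000000000")
print(c is d)    # typically False in CPython
```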

    Given all that, and given that several of these points (especially the ones about optimization) are implementation details, it's hard to give a canonical answer about the effective memory requirements of Python classes.

    Reducing the memory footprint of instances

    However, in case you want to reduce the memory footprint of your instances, definitely give __slots__ a try. They do have drawbacks, but if those don't apply to your case they are a very good way to reduce memory.

    class Slotted:
        __slots__ = ('a', 'b')
        def __init__(self):
            self.a = 1
            self.b = 1
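    The effect of `__slots__` is easy to verify: instances no longer carry a `__dict__` (or `__weakref__`), and setting an undeclared attribute fails - which is also the main drawback to keep in mind:

```python
class Slotted:
    __slots__ = ('a', 'b')
    def __init__(self):
        self.a = 1
        self.b = 1

s = Slotted()
print(hasattr(s, '__dict__'))     # False: no per-instance dict
print(hasattr(s, '__weakref__'))  # False: no weakref slot either

try:
    s.c = 3                       # 'c' is not declared in __slots__
except AttributeError as exc:
    print(exc)
```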
    

    If that's not enough and you operate with lots of "value types" you could also go a step further and create extension classes. These are classes that are defined in C but are wrapped so that you can use them in Python.

    For convenience I'm using the IPython bindings for Cython here to simulate an extension class:

    %load_ext cython
    
    %%cython
    
    cdef class Extensioned:
        cdef long long a
        cdef long long b
    
        def __init__(self):
            self.a = 1
            self.b = 1
    

    Measuring the memory usage

    The remaining interesting question after all this theory is: How can we measure the memory?

    I also use a normal class:

    class Dicted:
        def __init__(self):
            self.a = 1
            self.b = 1
    

    I'm generally using psutil (even though it's a proxy method) for measuring memory impact and simply measure how much memory it used before and after. The measurements are a bit offset because I need to keep the instances in memory somehow, otherwise the memory would be reclaimed (immediately). Also this is only an approximation because Python actually does quite a bit of memory housekeeping especially when there are lots of create/deletes.

    
    import os
    import psutil
    process = psutil.Process(os.getpid())
    
    runs = 10
    instances = 100_000
    
    memory_dicted = [0] * runs
    memory_slotted = [0] * runs
    memory_extensioned = [0] * runs
    
    for run_index in range(runs):
        for store, cls in [(memory_dicted, Dicted), (memory_slotted, Slotted), (memory_extensioned, Extensioned)]:
            before = process.memory_info().rss
            l = [cls() for _ in range(instances)]
            store[run_index] = process.memory_info().rss - before
            l.clear()  # reclaim memory for instances immediately
    

    The memory will not be exactly identical for each run, because Python re-uses some memory and sometimes also keeps memory around for other purposes, but it should at least give a reasonable hint:

    >>> min(memory_dicted) / 1024**2, min(memory_slotted) / 1024**2, min(memory_extensioned) / 1024**2
    (15.625, 5.3359375, 2.7265625)
    

    I used the min here mostly because I was interested in what the minimum was, and I divided by 1024**2 to convert the bytes to megabytes.

    Summary: As expected the normal class with dict will need more memory than classes with slots but extension classes (if applicable and available) can have an even lower memory footprint.
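    As a rough cross-check that doesn't need psutil, `sys.getsizeof` can compare single instances. Note that it does not follow references, so for the dict-based class the instance `__dict__` has to be counted separately (and even then, key-sharing makes the accounting approximate):

```python
import sys

class Dicted:
    def __init__(self):
        self.a = 1
        self.b = 1

class Slotted:
    __slots__ = ('a', 'b')
    def __init__(self):
        self.a = 1
        self.b = 1

d, s = Dicted(), Slotted()

# getsizeof reports only the object itself, so the instance __dict__
# of the dict-based class must be added separately.
dicted_size = sys.getsizeof(d) + sys.getsizeof(d.__dict__)
slotted_size = sys.getsizeof(s)

print(dicted_size, slotted_size)  # slotted instances come out smaller
```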

    Another tool that could be very handy for measuring memory usage is memory_profiler, although I haven't used it in a while.