
Different object size of True and False in Python 3


While experimenting with magic methods (__sizeof__ in particular) on different Python objects, I stumbled over the following behaviour:

Python 2.7

>>> False.__sizeof__()
24
>>> True.__sizeof__()
24

Python 3.x

>>> False.__sizeof__()
24
>>> True.__sizeof__()
28

What changed in Python 3 that makes the size of True greater than the size of False?


Solution

  • It is because bool is a subclass of int in both Python 2 and 3.

    >>> issubclass(bool, int)
    True
    

    But the int implementation has changed.

    In Python 2, int was the fixed-width type, 32 or 64 bits depending on the system, as opposed to the arbitrary-length long.

    In Python 3, int is arbitrary-length - the long of Python 2 was renamed to int and the original Python 2 int dropped altogether.


    In Python 2 you get exactly the same behaviour for the long objects 1L and 0L:

    Python 2.7.15rc1 (default, Apr 15 2018, 21:51:34) 
    [GCC 7.3.0] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.getsizeof(1L)
    28
    >>> sys.getsizeof(0L)
    24
    

    The long/Python 3 int is a variable-length object, just like a tuple: when it is allocated, enough memory is reserved to hold all the binary digits required to represent it. The length of the variable part is stored in the object header. 0 requires no binary digits (its variable length is 0), but even 1 needs one full internal digit.
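
The step-wise growth can be observed directly. The following is a minimal sketch, assuming CPython 3; size_at_digit_boundaries is a hypothetical helper name. The absolute sizes depend on the interpreter version and build, so only the step between the two sizes in each row is meaningful.

```python
import sys

def size_at_digit_boundaries(max_digits=3):
    """Return (k, size_below, size_at) for each internal digit boundary.

    size_below is the size of 2**(k*bits) - 1, which fits in k digits;
    size_at is the size of 2**(k*bits), which needs k + 1 digits.
    """
    bits = sys.int_info.bits_per_digit  # 30 on typical 64-bit builds
    rows = []
    for k in range(1, max_digits + 1):
        boundary = 2 ** (k * bits)
        rows.append((k, sys.getsizeof(boundary - 1), sys.getsizeof(boundary)))
    return rows

if __name__ == "__main__":
    print("bytes per digit:", sys.int_info.sizeof_digit)
    for k, below, at in size_at_digit_boundaries():
        print(f"{k} digit(s): {below} bytes -> {k + 1} digit(s): {at} bytes")
```

Each row should differ by exactly sys.int_info.sizeof_digit bytes, which is the cost of one additional digit.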

    I.e. 0 is represented as binary string of length 0:

    <>
    

    and 1 is represented as a 30-bit binary string:

    <000000000000000000000000000001>
    

    CPython's default configuration stores 30 value bits per digit in a uint32_t, so 2**30 - 1 still fits in 28 bytes on x86-64, while 2**30 requires 32.

    2**30 - 1 is represented as

    <111111111111111111111111111111>
    

    i.e. all 30 value bits set to 1; 2**30 needs a second digit, and its internal representation is

    <000000000000000000000000000001000000000000000000000000000000>
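
The bit strings shown in this answer can be reproduced with a small model. This is only a sketch of the layout described here, not a view into CPython's memory: to_digits and show are hypothetical helper names, the 30-bit digit width is an assumed default, and only non-negative values are handled.

```python
def to_digits(n, bits_per_digit=30):
    """Split a non-negative int into base-2**bits_per_digit digits,
    least significant digit first. Zero yields an empty list."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    mask = (1 << bits_per_digit) - 1
    digits = []
    while n:
        digits.append(n & mask)   # keep the lowest digit
        n >>= bits_per_digit      # drop it and continue with the rest
    return digits

def show(n, bits_per_digit=30):
    """Render the digits as one bit string, most significant digit first."""
    chunks = [format(d, f"0{bits_per_digit}b")
              for d in reversed(to_digits(n, bits_per_digit))]
    return "<" + "".join(chunks) + ">"

if __name__ == "__main__":
    for value in (0, 1, 2**30 - 1, 2**30):
        print(len(to_digits(value)), show(value))
```

In this model 0 has no digits, 1 and 2**30 - 1 have one digit each, and 2**30 is the first value that needs two.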
    

    As for True using 28 bytes instead of 24 - you need not worry. True is a singleton and therefore only 4 bytes are lost in total in any Python program, not 4 for every usage of True.
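
The singleton claim is easy to check. The following is a minimal sketch, assuming CPython 3; bool_footprint is a hypothetical helper name, and the exact sizes it reports vary with the interpreter version and build, so a given release may not show the 4-byte gap at all.

```python
import sys

def bool_footprint():
    """Return (size of True, size of False, difference) in bytes."""
    size_true = sys.getsizeof(True)
    size_false = sys.getsizeof(False)
    return size_true, size_false, size_true - size_false

def distinct_true_objects(count=1000):
    """Count the distinct objects behind `count` references to True."""
    flags = [True, 1 == 1, bool(42), not 0] + [True] * count
    return len({id(flag) for flag in flags})

if __name__ == "__main__":
    print(bool_footprint())
    print(distinct_true_objects())
```

However many references to True a program holds, they all point at the same object, so any size difference is paid once.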