Can a callout to C presize a Python dict's capacity?


As an optimization for handling a dict that will hold tens or hundreds of millions of keys, I'd really, really like to pre-size its capacity... but there seems to be no Pythonic way to do so.

Is it practical to use Cython or C callouts to directly call CPython's internal functions, such as dictresize() or _PyDict_NewPresized(), to achieve this?


Solution

  • It depends on what you mean by practical. It's certainly straightforward enough; you can just call _PyDict_NewPresized(howevermany). Heck, you can even do it from Python, using ctypes:

    >>> import ctypes
    >>> import sys
    >>> ctypes.pythonapi._PyDict_NewPresized.argtypes = [ctypes.c_ssize_t]
    >>> ctypes.pythonapi._PyDict_NewPresized.restype = ctypes.py_object
    >>> d = ctypes.pythonapi._PyDict_NewPresized(100)
    >>> sys.getsizeof(d)
    1676
    >>> sys.getsizeof({})
    140
    >>> len(d)
    0
    

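    As you can see, the dict is presized, but it has no elements. The exact sizes reported by sys.getsizeof() vary with the Python version and platform. Note that _PyDict_NewPresized() is a private function (hence the leading underscore), so it can change or disappear between CPython releases. Whether depending on CPython implementation details like this is practical is up to you.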
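
    If you do go this route, it may be worth wrapping the call so the program still works on interpreters where the symbol is not available. The following is only a minimal sketch: new_presized_dict is a hypothetical helper name, and falling back to a plain {} when the lookup fails is an assumption about what you would want, not anything CPython prescribes.

```python
import ctypes
import sys


def new_presized_dict(min_used):
    """Return an empty dict with room for roughly min_used keys.

    Hypothetical helper: falls back to an ordinary {} when the private
    CPython symbol cannot be found in this interpreter build.
    """
    try:
        func = ctypes.pythonapi._PyDict_NewPresized
    except AttributeError:
        # Private API: not exported here, so skip the optimization.
        return {}
    func.argtypes = [ctypes.c_ssize_t]  # the C parameter is a Py_ssize_t
    func.restype = ctypes.py_object     # the C function returns a new dict
    return func(min_used)


d = new_presized_dict(1000)
# Prints the length, then the sizes of the presized and a fresh empty dict.
print(len(d), sys.getsizeof(d), sys.getsizeof({}))
```

    The dict behaves like any other afterwards; the only thing presizing can save is the intermediate resizes while it fills up, so it is worth timing your real workload before committing to the dependency.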