
why is numba dict lookup so much slower than cpython?


I ran the following simple experiment:

import numpy as np
from numba import njit

@njit
def foo():
    # compiled function returning a Numba typed dict
    d = dict()
    d[(1.0, 2)] = np.random.rand(500)
    return d

e = dict()  # plain CPython dict for comparison
e[(1.0, 2)] = np.random.rand(500)
d = foo()

e and d are both dictionaries with one key each.

In [42]: %timeit e[(1.0, 2)][3]
123 ns ± 2.86 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
In [43]: %timeit d[(1.0, 2)][3]
1.99 µs ± 50.3 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Why is the numba dict 16 times slower?

This is with numba version 0.56.4.


Solution

  • It's probably not the dictionary lookup itself that's slow, but the context switch from Python -> Numba -> Python that adds significant overhead (for such a simple function).

    Python dictionaries are also very fast and well optimized, so as long as you stay within Python, they're hard to beat to begin with.
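    One way to see that overhead in isolation (this is a minimal sketch, not part of the original benchmark, and the exact numbers will depend on your machine and Numba version) is to time a jitted function that does nothing at all, so that whatever you measure is essentially pure call overhead:

    from numba import njit
    import timeit

    @njit
    def empty():
        # no work at all: any measured time is the cost of dispatching the
        # call and crossing the interpreter/compiled-code boundary
        return None

    empty()  # warm up so compilation time isn't included in the measurement

    # approximate per-call cost in seconds; compare it to the ~2 µs lookup above
    print(timeit.timeit(empty, number=1_000_000) / 1_000_000)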

    Whether overhead like that matters really depends on the actual use case. Here's an alternative test, doing 1000 lookups inside a jitted function:

    from numba import njit

    @njit
    def create_numba_dict():
        # typed dict with (int, int) tuple keys, built inside compiled code
        return {(i, i): i for i in range(1000)}

    @njit
    def index_numba_dict(d):
        # perform 1000 lookups and accumulate the results so the work is
        # actually used
        total = 0.0
        for i in range(1000):
            total += d[(i, i)]
        return total
    

    If I benchmark this with a warmup (to eliminate compilation overhead), these are the results I get:

    nb_dict = create_numba_dict()
    index_numba_dict(nb_dict)
    %timeit index_numba_dict(nb_dict)
    
    py_dict = {(i,i): i for i in range(1000)}
    %timeit sum([py_dict[(i,i)] for i in range(1000)])
    

    (benchmark results were shown as a screenshot in the original post)

    There's of course a little more going on than only the lookup. In my experience you have to do something with the result, otherwise the compiler is sometimes clever enough to never even execute it. And it still isn't a very realistic scenario.

    But I think it does show that, in the context of Numba, it's not the lookup that's intrinsically slow. If it were, each lookup would keep adding that cost to the total duration, regardless of how many lookups are performed.
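    As a practical follow-up (a hedged sketch using the d returned by foo() in the question, not something from the original answer), you can keep both the lookup and the work that consumes its result inside a single jitted function, so the Python/Numba boundary is crossed once per call instead of once per lookup:

    from numba import njit

    @njit
    def use_entry(d):
        # the lookup and the reduction both run in compiled code;
        # d is assumed to be the typed dict returned by foo()
        arr = d[(1.0, 2)]
        return arr.sum()

    result = use_entry(d)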