Tags: python, numba

Why is Numba performing 100× worse than native Python with a tuple of strings?


For performance reasons I would like to use Numba to speed up my code. However, the Numba function performs worse than the native Python function. Can someone explain why?

from numba import jit
import timeit

@jit(nopython=True, fastmath=True)
def get_exact_score_with_numba(tokens_to_match, candidate_tokens):
    score = 0.
    for token in tokens_to_match:
        if token in candidate_tokens:
            score += 1.
    return score / len(tokens_to_match)


def get_exact_score_without_numba(tokens_to_match, candidate_tokens):
    score = 0.
    for token in tokens_to_match:
        if token in candidate_tokens:
            score += 1.
    return score / len(tokens_to_match)


tokens_to_match = ('a', 'b')
candidate_tokens = ('a', 'b', 'c', 'd', 'e')

Performance measured with timeit, without Numba:

>>> number = 200000
>>> timeit.timeit(lambda: get_exact_score_without_numba(tokens_to_match, candidate_tokens), number=number)
0.0962326959999995

With Numba:

>>> timeit.timeit(lambda: get_exact_score_with_numba(tokens_to_match, candidate_tokens), number=number)
9.441522490000011

So the Numba version is roughly 100 times slower.


Solution

  • The get_exact_score_without_numba function takes 0.275 µs on my machine, which is a very small time for a function running in the CPython interpreter. An empty Numba function takes at least 0.25 µs on my machine because of the cost of switching from CPython to C code, performing some internal checks, etc. Thus, there is no way Numba can be significantly faster on this benchmark.

    Besides this, get_exact_score_with_numba is still abnormally slow in this case, since it takes 25 µs on my machine. This overhead comes from Numba itself, before your compiled function is even called. More specifically, it appears to come from the CPython-to-Numba type conversion at the call boundary, mainly due to the strings. Strings (like byte arrays) are not yet well supported in Numba; only experimental support is provided so far.
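
    Since the per-call cost is dominated by converting the strings at the boundary, one common workaround (a sketch of my own, not something the original code does) is to do the string work once in plain Python — map each token to an integer id — and pass only integers to the jitted function. Integer arguments cross the CPython→Numba boundary far more cheaply than strings. The `encode` helper and `vocab` dict below are illustrative names; the `@jit` decorator is commented out so the sketch also runs without Numba installed.

    ```python
    # from numba import jit

    # Uncomment the decorator once Numba is installed; the function body is
    # identical to the original, it just operates on integer ids.
    # @jit(nopython=True, fastmath=True)
    def get_exact_score_ids(ids_to_match, candidate_ids):
        score = 0.
        for token_id in ids_to_match:
            if token_id in candidate_ids:
                score += 1.
        return score / len(ids_to_match)

    # Build the string -> id mapping once, outside the hot path (plain Python).
    vocab = {}
    def encode(tokens):
        return tuple(vocab.setdefault(t, len(vocab)) for t in tokens)

    ids_to_match = encode(('a', 'b'))
    candidate_ids = encode(('a', 'b', 'c', 'd', 'e'))
    print(get_exact_score_ids(ids_to_match, candidate_ids))  # 1.0
    ```

    Note that even with this change, a function this small is still limited by the ~0.25 µs call overhead mentioned above; Numba pays off when each call does enough work to amortize that cost.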