Numba: when to use nopython=True?


I have the following setup:

import numpy as np
import matplotlib.pyplot as plt
import timeit
import numba
@numba.jit(nopython=True, cache=True)
def f(x):
    summ = 0
    for i in x:
        summ += i
    return summ

@numba.jit(nopython=True)
def g21(N, locs):
    rvs = np.random.normal(loc=locs, scale=locs, size=N)
    res = f(rvs)
    return res

@numba.jit(nopython=False)
def g22(N, locs):
    rvs = np.random.normal(loc=locs, scale=locs, size=N)
    res = f(rvs)
    return res

g22 and g21 are the exact same function, except that one has nopython=True and the other has nopython=False.

Now I give them an input. If locs is a scalar, Numba should be able to compile everything, since it supports numpy.random.normal() with this signature. However, if locs is an array, Numba does not support this signature and should fall back to the Python interpreter.

I run this first just to compile the functions

N = 10_000

g22(N, 3)
g22(N, np.linspace(0,1,N))
g21(N, 3)
# g21(N, np.linspace(0,1,N))  # raises an error

Now I run a speed comparison

%timeit g21(N, 3)
%timeit g22(N, 3)
%timeit g22(N, np.linspace(0,1,N))

which returns

274 µs ± 3.43 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
270 µs ± 5.38 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
421 µs ± 54.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

It makes sense that g22(N, np.linspace(0,1,N)) is the slowest, since it falls back to the Python interpreter.

However, what I don't understand is why g21(N, 3) runs at roughly the same speed as g22(N, 3), even though one has nopython=True and the other does not.

On top of that, g22 has the big advantage that it also accepts an array argument, as in g22(N, np.linspace(0,1,N)), so it is more versatile. At the same time, there seems to be no speed penalty for using nopython=False.

So my questions are:

  1. in this case, what is the use of using nopython=True, if a function with nopython=False achieves same speed?

  2. in which specific case is nopython=True better than nopython=False?


Solution

    1. in this case, what is the use of using nopython=True, if a function with nopython=False achieves same speed?
    2. in which specific case is nopython=True better than nopython=False?

    The documentation states:

    Numba has two compilation modes: nopython mode and object mode. The former produces much faster code, but has limitations that can force Numba to fall back to the latter. To prevent Numba from falling back, and instead raise an error, pass nopython=True.

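    Note that Numba will try to compile the code to a native binary in both modes. However, nopython mode raises an error when this is not possible, while the other mode emits a warning and uses fallback code instead. This is why g21(N, 3) and g22(N, 3) run at the same speed: the scalar signature compiles fine in nopython mode, so both decorators end up with the same kind of native code.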

    For some applications, performance is critical, so you really do not want the fallback code to be called. This is the case for high-performance applications, for example. Getting an error is then better than having code that runs for days instead of a few minutes on an expensive machine (like a supercomputer or a computing server). A different version of Numba on another machine can also silently trigger the fallback because some feature is not supported there. I personally always use nopython mode to prevent such cases (the fallback code is generally too slow to be useful), and I consider object mode a bit useless. Put shortly, nopython offers stronger guarantees about performance.