I am trying to optimize this function with numba:
```python
@jit(nopython=False, forceobj=True)
def _packing_loop(data: List[str], indices: np.ndarray, strings: List[str], offsets: List[int], str_map: Dict[str, int], acc_len):
    for i, s in enumerate(data):
        # handle null strings
        if not s:
            indices[i] = -1
            continue
        index = str_map.get(s)
        if index is None:
            # increment the length
            acc_len += len(s)
            # store the string and index
            index = len(strings)
            strings.append(s)
            # strings += s,
            str_map[s] = index
            # write the offset
            offsets.append(acc_len)
            # offsets += acc_len,
        indices[i] = index
```
The issue is that the "optimized" code is ~1.5 times slower than plain Python (even if I pre-run the function once before benchmarking to exclude compilation time).
What is the possible reason? I would also be grateful for any suggestions on how to actually optimize this function.
P.S. I am not limited to numba, other approaches are also possible.
`forceobj=True` generates inefficient code. To quote the documentation:

> If true, forceobj forces the function to be compiled in object mode. Since object mode is slower than nopython mode, this is mostly useful for testing purposes.

You should not use it in production. Besides, Numba does not support reflected lists well; you should use typed lists for the sake of performance. In fact, typing is what makes Numba fast: without type information, Numba cannot generate compiled code, and a slow interpreted version is executed instead. Note that strings are barely supported and clearly not efficient yet. AFAIK, Numba does not benefit from type annotations like `List[str]` either.

I advise you to use Cython in this case. That being said, the speedup will certainly be small, since you are dealing mostly with slow CPython dynamic objects.
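Before reaching for Cython, it may be worth checking how far plain Python gets you: since the loop manipulates CPython objects anyway, small interpreter-level tweaks (a single `dict.setdefault` call instead of `get` plus a separate store, hoisting the method lookup out of the loop) can shave off overhead. Below is a rough sketch of that idea; the function name and the self-contained signature are my assumptions, not the asker's exact API:

```python
import numpy as np

def pack_strings(data):
    # Sketch of the same interning loop in plain Python (assumed names,
    # not the asker's exact API): each distinct string gets an index,
    # null/empty strings are marked with -1.
    indices = np.empty(len(data), dtype=np.int64)
    strings = []
    offsets = []
    str_map = {}
    acc_len = 0
    setdefault = str_map.setdefault  # hoist the attribute lookup out of the loop
    for i, s in enumerate(data):
        if not s:
            indices[i] = -1
            continue
        # One dict operation instead of get() followed by a separate store.
        index = setdefault(s, len(strings))
        if index == len(strings):  # s was not seen before
            acc_len += len(s)
            strings.append(s)
            offsets.append(acc_len)  # cumulative end offset of each string
        indices[i] = index
    return indices, strings, offsets
```

Whether this beats the original loop depends on your data (share of duplicates, string lengths), so benchmark it on a representative sample before committing to either approach.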