Tags: python, python-import, numba

Python: importing numba-compiled functions is very slow


I'm using numba to develop my package, and it does speed up my calculations severalfold. The problem I'm facing now is that when I run my package from the command line using

python my_package_name.py

a lot of time is spent on imports. To demonstrate this, I tested with -X importtime:

python -X importtime test_my_package.py

The results are:

import time: self [us] | cumulative | imported package
...
import time:      2422 |       3620 |               numpy.core._multiarray_umath
import time:       425 |        425 |                   numpy.compat._inspect
import time:        62 |         62 |                       errno
import time:       439 |        439 |                         urllib
import time:      1199 |       1638 |                       urllib.parse
import time:      1036 |       2735 |                     pathlib
...
import time:      1408 |       2748 |                     pickle
...
import time:     15060 |      24697 |           numpy.core._add_newdocs_scalars
...
import time:      4627 |     111112 |       numpy
...
import time:      1043 |      31561 |           numba.core.config
...
import time:      1111 |       5296 |                 numba.core.errors
import time:       434 |       5729 |               numba.core.types.common
import time:       566 |        566 |                   numba.core.typeconv.castgraph
import time:       379 |        944 |                 numba.core.typeconv
import time:       331 |        331 |                   numba.core.consts
import time:      1104 |       1435 |                 numba.core.ir
import time:      1083 |       3461 |               numba.core.types.misc
import time:      1293 |      10483 |             numba.core.types.containers
import time:      2129 |       2129 |               logging
import time:      1775 |       3904 |             numba.core.types.functions
...
import time:      9340 |       9340 |             scipy._distributor_init
import time:      1915 |       1915 |             scipy._lib._pep440
import time:       562 |        562 |               scipy._lib._ccallback_c
import time:       952 |       1513 |             scipy._lib._ccallback
import time:      2882 |      17989 |           scipy
import time:      3957 |     211268 |         numba
...
import time:  13506351 |   13834005 |       my_package.utilities
...
import time:   3029710 |    3029710 |       my_package.extract_features
import time:   4805845 |    4805845 |       my_package.fast_annotate_spectrum
...
import time:    345628 |     345628 |         my_package.modification_correction
...
import time:   3769461 |    3769461 |             my_package.xxx.fast_annotate_spectrum
...
import time:   4831766 |    4847170 |               my_package.xxx.utilities
...
import time:   9825949 |    9825949 |             my_package.machinelearning.utilities
...
import time:     55041 |     102863 |                                   scipy.stats._continuous_distns
...
import time:   1178353 |   11814805 |           my_package.machinelearning.randomforest
...
import time:   2767857 |    2767857 |       my_package.retrieval_utilities

The list is really long (>1300 lines), so I removed the entries with small import times but kept some of the larger numba, numpy and scipy entries as benchmarks. Clearly, the import times for the modules in my_package are very large, in one case more than 10 seconds (my_package.utilities). These modules contain all the calculation functions I implemented, which are accelerated with the @numba.njit decorator.
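Trimming a report of this length by hand is tedious, so the filtering can be scripted. The following is only a minimal sketch, assuming the report has been saved as text in the format shown above; the function name slowest_imports is made up for illustration, and the sample rows are copied from the listing above.

```python
import re

# Matches report rows such as:
# "import time:      2422 |       3620 |     numpy.core._multiarray_umath"
LINE_RE = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(\S+)\s*$")


def slowest_imports(report, top=10):
    """Return the `top` rows (self_us, cumulative_us, module), largest self time first."""
    rows = []
    for line in report.splitlines():
        match = LINE_RE.match(line)
        if match:  # the header row and "..." lines do not match and are skipped
            self_us, cumulative_us, module = match.groups()
            rows.append((int(self_us), int(cumulative_us), module))
    return sorted(rows, reverse=True)[:top]


# A few rows copied from the report in the question.
sample = """\
import time: self [us] | cumulative | imported package
import time:      4627 |     111112 |       numpy
import time:      3957 |     211268 |         numba
import time:  13506351 |   13834005 |       my_package.utilities
import time:   3029710 |    3029710 |       my_package.extract_features
"""

for self_us, cumulative_us, module in slowest_imports(sample, top=3):
    print(f"{self_us / 1e6:8.3f} s  {module}")
```

Python writes the -X importtime report to stderr, so something like `python -X importtime test_my_package.py 2> importtime.log` should capture it for a script of this kind.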

Since the import times of all the other modules look normal compared with Python's built-in modules, I suspect that the long import times come from compiling the functions decorated with @numba.njit. Indeed, when I commented out some of the @numba.njit decorators in my_package.utilities, turning those functions into plain Python functions, the import time dropped dramatically:

import time:   1921836 |    2647152 |       my_package.utilities

Is there any way I can improve this?


Solution

  • You can cache the njit functions:

    @numba.njit(cache=True)
    

    This reloads the compiled functions from the on-disk cache the next time you run your program, so they are not compiled again.

    However, it can get a bit tricky to delete the cached functions if you need to make a change.
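If the cache ever has to be cleared by hand, a small helper can do it. This is only a sketch under some assumptions: by default numba stores its cache as index (.nbi) and data (.nbc) files in the __pycache__ directory next to the source file, and if the NUMBA_CACHE_DIR environment variable is set the files live there instead. The helper name clear_numba_cache and the file names in the demo are made up for illustration.

```python
import tempfile
from pathlib import Path


def clear_numba_cache(package_dir):
    """Delete numba cache files (*.nbi index, *.nbc data) found in
    __pycache__ directories under package_dir; return the removed paths."""
    removed = []
    for pattern in ("*.nbi", "*.nbc"):
        # Materialize the matches first so we do not delete while iterating.
        for path in list(Path(package_dir).rglob(pattern)):
            if path.parent.name == "__pycache__":  # leave other files alone
                path.unlink()
                removed.append(path)
    return removed


# Demo on a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp) / "my_package" / "__pycache__"
    cache_dir.mkdir(parents=True)
    for name in (
        "utilities.func-12.py311.nbi",
        "utilities.func-12.py311.1.nbc",
        "utilities.cpython-311.pyc",
    ):
        (cache_dir / name).touch()

    removed = clear_numba_cache(tmp)
    remaining = sorted(p.name for p in cache_dir.iterdir())

print(len(removed), remaining)
```

The regular .pyc bytecode files are left untouched; only the numba cache files are removed, so the decorated functions are simply recompiled and cached again on the next run.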