I'm using numba to develop my package, and it actually gives me severalfold speed-ups in the calculations. The problem I'm now facing is that when I run my package from the command line using
python my_package_name.py
a lot of time is spent on importing. To demonstrate this, I used -X importtime
for testing:
python -X importtime test_my_package.py
The results are:
import time: self [us] | cumulative | imported package
...
import time: 2422 | 3620 | numpy.core._multiarray_umath
import time: 425 | 425 | numpy.compat._inspect
import time: 62 | 62 | errno
import time: 439 | 439 | urllib
import time: 1199 | 1638 | urllib.parse
import time: 1036 | 2735 | pathlib
...
import time: 1408 | 2748 | pickle
...
import time: 15060 | 24697 | numpy.core._add_newdocs_scalars
...
import time: 4627 | 111112 | numpy
...
import time: 1043 | 31561 | numba.core.config
...
import time: 1111 | 5296 | numba.core.errors
import time: 434 | 5729 | numba.core.types.common
import time: 566 | 566 | numba.core.typeconv.castgraph
import time: 379 | 944 | numba.core.typeconv
import time: 331 | 331 | numba.core.consts
import time: 1104 | 1435 | numba.core.ir
import time: 1083 | 3461 | numba.core.types.misc
import time: 1293 | 10483 | numba.core.types.containers
import time: 2129 | 2129 | logging
import time: 1775 | 3904 | numba.core.types.functions
...
import time: 9340 | 9340 | scipy._distributor_init
import time: 1915 | 1915 | scipy._lib._pep440
import time: 562 | 562 | scipy._lib._ccallback_c
import time: 952 | 1513 | scipy._lib._ccallback
import time: 2882 | 17989 | scipy
import time: 3957 | 211268 | numba
...
import time: 13506351 | 13834005 | my_package.utilities
...
import time: 3029710 | 3029710 | my_package.extract_features
import time: 4805845 | 4805845 | my_package.fast_annotate_spectrum
...
import time: 345628 | 345628 | my_package.modification_correction
...
import time: 3769461 | 3769461 | my_package.xxx.fast_annotate_spectrum
...
import time: 4831766 | 4847170 | my_package.xxx.utilities
...
import time: 9825949 | 9825949 | my_package.machinelearning.utilities
...
import time: 55041 | 102863 | scipy.stats._continuous_distns
...
import time: 1178353 | 11814805 | my_package.machinelearning.randomforest
...
import time: 2767857 | 2767857 | my_package.retrieval_utilities
The list is really long (>1300 lines), so I removed the entries with small import times, but kept some entries for numba, numpy and scipy with relatively large times as benchmarks. Obviously, the import times for the modules in my_package are significantly large, some even more than 10 s (e.g., my_package.utilities). These modules contain all the functions I implemented for the calculations, which are accelerated using numba.njit, i.e., decorated with @numba.njit.
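For illustration, here is a toy function of the kind these modules contain (not my real code; the name and body are made up). Note that an explicit signature like the one below makes numba compile eagerly when the module is imported, whereas a bare @numba.njit defers compilation to the first call:
import numba
import numpy as np

# Hypothetical stand-in for the numerical kernels in my_package.utilities.
# The explicit signature triggers compilation at import time; without it,
# numba would compile lazily on the first call instead.
@numba.njit("float64[:](float64[:], float64[:])")
def add_arrays(a, b):
    out = np.empty_like(a)
    for i in range(a.shape[0]):
        out[i] = a[i] + b[i]
    return out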
Since the imports of all the other modules look quite normal compared to importing Python's built-in modules, I suspect that the large import time comes from importing the functions set up for numba compilation (by @numba.njit). Indeed, when I commented out some of the @numba.njit decorators in the module my_package.utilities to make them plain Python functions, the import time was reduced dramatically:
import time: 1921836 | 2647152 | my_package.utilities
Is there any way I can improve this?
You can cache the njit functions:
@numba.njit(cache=True)
This will reload the compiled functions the next time you run your program. However, it can get a bit tricky to delete the cached functions if you need to make a change.
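If you do need to force a recompile, numba writes its cache files into the __pycache__ directory next to each module's source, as .nbi (index) and .nbc (compiled code) files, so you can delete those. A minimal sketch (the helper name and package path are my own; adjust to your layout):
import pathlib

def clear_numba_cache(package_root):
    # Remove numba's on-disk cache: *.nbi index files and
    # *.nbc compiled blobs under the package directory.
    root = pathlib.Path(package_root)
    for pattern in ("*.nbi", "*.nbc"):
        for cache_file in root.rglob(pattern):
            cache_file.unlink()

clear_numba_cache("my_package")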