I am using a list comprehension to gather data from a list of tuples. Code below:
data = [result[0] for result in results]  # results is a list of tuples; I take the first element from each tuple
This works fine.
Recently I came across the numba module, which is supposed to speed up loop execution. So I tried this to test the timing:
import numba
from numba import literal_unroll
from datetime import datetime
import logging
numba_logger = logging.getLogger('numba')
numba_logger.setLevel(logging.WARNING)
@numba.jit(nopython=True)
def loop_faster(results):
    for result in literal_unroll(results):
        print(result)
tuples = (1.1, "Hello", 1, "World", "Tuple-1")
print(datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3])
loop_faster(tuples)
print(datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3])
for result in tuples:
    print(result)
print(datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3])
I referred to this link for literal_unroll: https://numba.pydata.org/numba-doc/dev/reference/pysupported.html
However, the plain for loop seems to perform far better than the numba version.
Results for the above program:
2021-03-04 10:51:36.385
1.1
Hello
1
World
Tuple-1
2021-03-04 10:51:47.234
1.1
Hello
1
World
Tuple-1
2021-03-04 10:51:47.236
Why does this happen? The numba version took over 10 seconds.
For my case, building a list from the nth element of each tuple, how can I implement this using the numba module?
The reason is very simple: on the first call of the function, a lot of time is spent compiling your code into LLVM IR and then machine code, which is what numba's JIT does.
So you have to add the parameter cache=True to your @numba.jit decorator so that the compiled version is cached. You also have to call the function once before measuring time, to make sure compilation has already happened. And you have to run more loop iterations to measure time more precisely; a run of just 10 milliseconds is not enough.
The following code does the three things mentioned above. You can see that numba gives about a 5.5x speedup.
I also changed your code to use slightly different logic, because timing print calls does not measure the loop itself correctly. Numba is meant for computationally heavy code, not for printing to the console. So I just created a tuple of random integers and computed each element + 1. This is enough as an example for you to see that numba runs much faster.
import numba, logging, random
from datetime import datetime
numba_logger = logging.getLogger('numba')
numba_logger.setLevel(logging.WARNING)
@numba.njit(cache = True)
def loop_faster(results, n):
    for i in range(n):
        res = []
        for result in numba.literal_unroll(results):
            res.append(result + 1)
t = tuple(random.randrange(1 << 20) for i in range(100))
loop_faster(t, 10) # pre-compile numba
print(datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3])
loop_faster(t, 1 << 16)
print(datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3])
for i in range(1 << 16):
    res = []
    for result in t:
        res.append(result + 1)
print(datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3])
Output:
2021-03-04 12:18:04.491
2021-03-04 12:18:04.840
2021-03-04 12:18:06.774
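The warm-up-then-measure advice above can also be sketched with time.perf_counter instead of printing timestamps. This is only a minimal sketch using the standard library: the helper names measure and first_elements and the sample data are hypothetical, not part of numba or of the code above. The same helper could be pointed at a numba-jitted function, in which case the untimed warm-up call is where the one-time compilation would happen.

```python
import time


def measure(fn, *args, warmup=1, repeat=5):
    """Return the best wall-clock time in seconds over `repeat` calls of fn(*args)."""
    # Warm-up calls are not timed. For a JIT-compiled function this is
    # where the one-time compilation cost would be paid.
    for _ in range(warmup):
        fn(*args)
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def first_elements(results):
    # Same list comprehension as in the question.
    return [result[0] for result in results]


# Hypothetical sample data: 100,000 two-element tuples.
results = [(i, str(i)) for i in range(100_000)]
elapsed = measure(first_elements, results)
print(f"best of 5 runs: {elapsed * 1000:.3f} ms")
```

Taking the best of several runs, rather than a single run, reduces the influence of unrelated activity on the machine while the measurement is taken.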