I'm having a hard time applying Numba to my function.
Basically, I'd like to concatenate two arrays with 22 columns, provided the new data hasn't been added yet. If there is no old data, the new data should become a 2D array.
The function works fine without the decorator:
@jit(nopython=True)
def add(new,original=np.array([])):
    duplicate=True
    if original.size!=0:
        for raw in original:
            for ii in range(11,19):
                if raw[ii]!=new[ii]:
                    duplicate=False
        if duplicate==False:
            res=np.zeros((original.shape[0]+1,22))
            res[:original.shape[0]]=original
            res[-1]=new
            return res
        else:
            return original
    else:
        res=np.zeros((1,22))
        res[0]=new
        return res
Also if I remove the last part of the code:
    else:
        res=np.zeros((1,22))
        res[0]=new
        return res
It then works with njit.
So if I ignore the case where there is no old data yet, everything is fine.
FYI: the data I'm passing in is a mix of floats and np.nan.
Does anybody have an idea? Thank you so much in advance!
This is my error log:
---------------------------------------------------------------------------
TypingError Traceback (most recent call last)
<ipython-input-255-d05a5f4ea944> in <module>()
19 return res
20 #add(a,np.array([b]))
---> 21 add(a)
2 frames
/usr/local/lib/python3.7/dist-packages/numba/core/dispatcher.py in _compile_for_args(self, *args, **kws)
413 e.patch_message(msg)
414
--> 415 error_rewrite(e, 'typing')
416 except errors.UnsupportedError as e:
417 # Something unsupported is present in the user code, add help info
/usr/local/lib/python3.7/dist-packages/numba/core/dispatcher.py in error_rewrite(e, issue_type)
356 raise e
357 else:
--> 358 reraise(type(e), e, None)
359
360 argtypes = []
/usr/local/lib/python3.7/dist-packages/numba/core/utils.py in reraise(tp, value, tb)
78 value = tp()
79 if value.__traceback__ is not tb:
---> 80 raise value.with_traceback(tb)
81 raise value
82
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function getitem>) found for signature:
    >>> getitem(float64, int64)
There are 22 candidate implementations:
    - Of which 22 did not match due to:
    Overload of function 'getitem': File: <numerous>: Line N/A.
      With argument(s): '(float64, int64)':
     No match.
During: typing of intrinsic-call at <ipython-input-255-d05a5f4ea944> (7)
File "<ipython-input-255-d05a5f4ea944>", line 7:
def add(new,original=np.array([])):
    <source elided>
            for ii in range(11,19):
                if raw[ii]!=new[ii]:
                ^
Update: Here is how it should work. The function should cover three main cases.
Sample input for new data (1D array):
array([9.0000000e+00, 0.0000000e+00, 1.0000000e+00, 0.0000000e+00,
0.0000000e+00, nan, 5.7300000e-01, 9.2605450e-01,
9.3171725e-01, 9.2039175e-01, 9.3450000e-01, 1.6491636e+09,
1.6494228e+09, 1.6496928e+09, 1.6497504e+09, 9.2377000e-01,
9.3738000e-01, 9.3038000e-01, 9.3450000e-01, nan,
nan, nan])
Sample input for original data (2D array):
array([[4.00000000e+00, 0.00000000e+00, 1.00000000e+00, 0.00000000e+00,
0.00000000e+00, nan, 5.23000000e-01, 8.31589755e-01,
8.34804877e-01, 8.28374632e-01, 8.36090000e-01, 1.64938320e+09,
1.64966400e+09, 1.64968920e+09, 1.64975760e+09, 8.30750000e-01,
8.38020000e-01, 8.34290000e-01, 8.36090000e-01, nan,
nan, nan]])
add(new)
Output:
array([[9.0000000e+00, 0.0000000e+00, 1.0000000e+00, 0.0000000e+00,
0.0000000e+00, nan, 5.7300000e-01, 9.2605450e-01,
9.3171725e-01, 9.2039175e-01, 9.3450000e-01, 1.6491636e+09,
1.6494228e+09, 1.6496928e+09, 1.6497504e+09, 9.2377000e-01,
9.3738000e-01, 9.3038000e-01, 9.3450000e-01, nan,
nan, nan]])
add(new,original)
Output:
array([[4.00000000e+00, 0.00000000e+00, 1.00000000e+00, 0.00000000e+00,
0.00000000e+00, nan, 5.23000000e-01, 8.31589755e-01,
8.34804877e-01, 8.28374632e-01, 8.36090000e-01, 1.64938320e+09,
1.64966400e+09, 1.64968920e+09, 1.64975760e+09, 8.30750000e-01,
8.38020000e-01, 8.34290000e-01, 8.36090000e-01, nan,
nan, nan],
[9.00000000e+00, 0.00000000e+00, 1.00000000e+00, 0.00000000e+00,
0.00000000e+00, nan, 5.73000000e-01, 9.26054500e-01,
9.31717250e-01, 9.20391750e-01, 9.34500000e-01, 1.64916360e+09,
1.64942280e+09, 1.64969280e+09, 1.64975040e+09, 9.23770000e-01,
9.37380000e-01, 9.30380000e-01, 9.34500000e-01, nan,
nan, nan]])
add(new,original)
Output:
array([[9.0000000e+00, 0.0000000e+00, 1.0000000e+00, 0.0000000e+00,
0.0000000e+00, nan, 5.7300000e-01, 9.2605450e-01,
9.3171725e-01, 9.2039175e-01, 9.3450000e-01, 1.6491636e+09,
1.6494228e+09, 1.6496928e+09, 1.6497504e+09, 9.2377000e-01,
9.3738000e-01, 9.3038000e-01, 9.3450000e-01, nan,
nan, nan]])
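The main issue is that Numba infers from the default value np.array([]) that original is a 1D array, which is not the case. Iterating over a 1D array yields float64 scalars, so raw[ii] becomes getitem(float64, int64), which is exactly the signature reported in the error. The pure-Python code works because the interpreter never executes the body of the loop for raw in original when original is empty, whereas Numba needs to compile all of the code before executing it. You can solve this problem using the following function prototype: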
def add(new,original=np.array([[]])): # Note the `[[]]` instead of `[]`
With that, Numba can correctly deduce that the original array is a 2D one.
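To make the difference between the two defaults concrete, here is a small NumPy-only sketch. It only inspects the arrays and does not involve Numba; the variable names are made up for illustration.

```python
import numpy as np

flat_default = np.array([])      # what the question used: shape (0,), 1D
nested_default = np.array([[]])  # the suggested default: shape (1, 0), 2D

# Both are empty, so `original.size != 0` is False for either of them.
print(flat_default.ndim, flat_default.size)      # 1 0
print(nested_default.ndim, nested_default.size)  # 2 0

# Iterating a 1D array yields scalars, while iterating a 2D array yields rows.
# Only the rows can be indexed with raw[ii].
item_from_1d = next(iter(np.zeros(3)))
row_from_2d = next(iter(np.zeros((1, 3))))
print(type(item_from_1d).__name__, row_from_2d.shape)  # float64 (3,)
```

Since both defaults have size 0, the empty-data branch is still taken at run time; only the type seen by the compiler changes.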
Note that explicitly specifying the dimensions and types of NumPy arrays and other inputs is a good way to avoid such errors and sneaky bugs (e.g. due to integer/float truncation).
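As a rough sketch of what specifying the types could look like, the variant below uses an explicit signature string and expects the caller to pass an empty 2D array when there is no old data, rather than relying on a default argument. The name add_row, the np.empty((0, 22)) convention, and the per-row duplicate check (a row counts as a duplicate when columns 11 to 18 all match) are my assumptions, not part of the original code. The try/except fallback is only there so the sketch also runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback: a no-op decorator so the sketch still runs as plain Python.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]):
            return args[0]
        return lambda func: func


# Explicit signature: 1D float64 input, 2D float64 input, 2D float64 output.
@njit("float64[:, :](float64[:], float64[:, :])")
def add_row(new, original):
    n = original.shape[0]
    # Assumed duplicate rule: some existing row matches `new` in columns 11..18.
    for i in range(n):
        same = True
        for j in range(11, 19):
            if original[i, j] != new[j]:
                same = False
                break
        if same:
            return original
    # No duplicate found (or no old data): append `new` as the last row.
    res = np.empty((n + 1, new.shape[0]))
    res[:n] = original
    res[n] = new
    return res


new = np.arange(22, dtype=np.float64)
first = add_row(new, np.empty((0, 22)))  # no old data yet
again = add_row(new, first)              # `new` is already present
other = new.copy()
other[11] = -1.0
both = add_row(other, first)             # differs in column 11, so it is appended
print(first.shape, again.shape, both.shape)
```

With a fixed signature, a call with the wrong dimensionality should be rejected up front instead of failing somewhere in the middle of type inference.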