python parallel-processing multiprocessing numba functools

How to use numba together with functools.reduce()


I have the following code, in which I am trying to parallelize a loop using numba, functools.reduce() and operator.mul:

import numpy as np
from itertools import product
from functools import reduce
from operator import mul
from numba import jit, prange

lst = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
arr = np.array(lst)
n = 3
flat = np.ravel(arr).tolist()
gen = np.array([list(a) for a in product(flat, repeat=n)])

@jit(nopython=True, parallel=True)
def mtp(gen):
    results = np.empty(gen.shape[0])
    for i in prange(gen.shape[0]):
        results[i] = reduce(mul, gen[i], initializer=None)
    return results
mtp(gen)

But this is giving me an error:

---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
<ipython-input-503-cd6ef880fd4a> in <module>
     10         results[i] = reduce(mul, gen[i], initializer=None)
     11     return results
---> 12 mtp(gen)

~\Anaconda3\lib\site-packages\numba\dispatcher.py in _compile_for_args(self, *args, **kws)
    399                 e.patch_message(msg)
    400 
--> 401             error_rewrite(e, 'typing')
    402         except errors.UnsupportedError as e:
    403             # Something unsupported is present in the user code, add help info

~\Anaconda3\lib\site-packages\numba\dispatcher.py in error_rewrite(e, issue_type)
    342                 raise e
    343             else:
--> 344                 reraise(type(e), e, None)
    345 
    346         argtypes = []

~\Anaconda3\lib\site-packages\numba\six.py in reraise(tp, value, tb)
    666             value = tp()
    667         if value.__traceback__ is not tb:
--> 668             raise value.with_traceback(tb)
    669         raise value
    670 

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Invalid use of Function(<built-in function reduce>) with argument(s) of type(s): (Function(<built-in function mul>), array(int32, 1d, C), initializer=none)
 * parameterized
In definition 0:
    AssertionError: 
    raised from C:\Users\HP\Anaconda3\lib\site-packages\numba\parfor.py:4138
In definition 1:
    AssertionError: 
    raised from C:\Users\HP\Anaconda3\lib\site-packages\numba\parfor.py:4138
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: resolving callee type: Function(<built-in function reduce>)
[2] During: typing of call at <ipython-input-503-cd6ef880fd4a> (10)


File "<ipython-input-503-cd6ef880fd4a>", line 10:
def mtp(gen):
    <source elided>
    for i in prange(gen.shape[0]):
        results[i] = reduce(mul, gen[i], initializer=None)
        ^

I am not sure where I have gone wrong. Can anyone point me in the right direction? Many thanks.


Solution

  • You can use np.prod inside of a numba jitted function:

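    import numpy as np
    from itertools import product
    from numba import jit, prange
    
    n = 3
    lst = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    arr = np.array(lst)
    flat = np.ravel(arr).tolist()
    # Pass a 2-D array: nested Python lists are not supported as arguments in nopython mode
    gen = np.array([list(a) for a in product(flat, repeat=n)])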
    
    @jit(nopython=True, parallel=True)
    def mtp(gen):
        results = np.empty(len(gen))
        for i in prange(len(gen)):
            results[i] = np.prod(gen[i])
        return results
    

    Alternatively, you can use reduce as shown below (thanks to @stuartarchibald for pointing that out), although parallelization does not work with this approach (at least as of numba 0.48):

    import numpy as np
    from itertools import product
    from functools import reduce
    from operator import mul
    from numba import njit, prange
    
    lst = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    arr = np.array(lst)
    n = 3
    flat = np.ravel(arr).tolist()
    gen = np.array([list(a) for a in product(flat, repeat=n)])
    
    @njit
    def mul_wrapper(x, y):
        return mul(x, y)
    
    @njit
    def mtp(gen):
        results = np.empty(gen.shape[0])
        for i in prange(gen.shape[0]):
            results[i] = reduce(mul_wrapper, gen[i], None)
        return results
    
    print(mtp(gen))
    

    Or, because Numba has a bit of magic that spots closures that would escape a function and compiles them (again, thanks to @stuartarchibald), you can use this:

    @njit
    def mtp(gen):
        results = np.empty(gen.shape[0])
        def op(x, y):
            return mul(x, y)
        for i in prange(gen.shape[0]):
            results[i] = reduce(op, gen[i], None)
        return results
    

    But again, parallel=True doesn't work here as of numba 0.48.

    Note that the approach recommended by a member of the core dev team is the first solution, which uses np.prod. It works with the parallel flag and has a more straightforward implementation.
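
To check the output of any of the compiled versions above, a plain-Python reference can help. The following is a minimal sketch that uses only the standard library and recomputes the same row products with functools.reduce in the interpreter, without Numba. The helper name `reference_products` is hypothetical and not part of the original answer. The initializer is simply omitted here, so reduce starts from the first element of each row.

```python
from functools import reduce
from itertools import product
from operator import mul


def reference_products(values, n):
    """Product of every length-n combination (with repetition) of values.

    Hypothetical helper for illustration; runs in the interpreter, no Numba.
    """
    return [reduce(mul, combo) for combo in product(values, repeat=n)]


flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]
expected = reference_products(flat, 3)

print(len(expected))   # 729 rows, i.e. 9 ** 3
print(expected[0])     # 1, from (1, 1, 1)
print(expected[-1])    # 729, from (9, 9, 9)
```

Since `gen` is built from the same `product(flat, repeat=n)` call, the rows are in the same order, so something like `np.allclose(mtp(gen), expected)` should confirm that a compiled version agrees with this reference.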