python numpy dynamic-programming numba jit

Python: Numba njit mode for faster dynamic programming

I am new to Python. I would like to use Python for my numerical experiment, in which I need to solve many dynamic programming problems exactly. So, it is important to optimize my code for efficiency. My code actually works with @jit with Numba, but I would like to push the performance further with @njit. You could see from the below code that, I have tried to vectorize my operations inside the for loop for efficiency. As I mentioned before, @jit works fine, but with @njit, it keeps giving me error messages. It is common knowledge that solving dynamic programs exactly is computationally-intensive, and therefore I would really like to use @njit to further push the performance. I could really use some help with how to change the code to allow for @njit. Thank you so much in advance!

My code:

import numba as nb
import numpy as np

#DP computation
@nb.njit
def dp(beta,cost,wcost,decisions,number_of_stages,states):
    tbeta=1-beta
    odcost=min((cost[-max(decisions):]+wcost)/beta[-max(decisions):])
    terminal=(max(states)-states)*odcost
    L=max(states)
    D=number_of_stages
    value=np.zeros((D+1,L+1))
    choice=np.zeros((D+1,L)).astype(np.int64)
    value[-1]=terminal
    for s in range(D-1,L-2,-1):
        intmatrix=cost[:, None]+np.outer(beta,value[s+1][1:L+1])+np.outer(tbeta,value[s+1][0:L])
        choice[s]=intmatrix.T.argmin(axis=1)
        value[s][0:L]=intmatrix[choice[s],np.arange(intmatrix.shape[1])]
    
    for s in range(L-2,-1,-1):
        intmatrix=cost[:, None]+np.outer(beta,value[s+1][1:s+2])+np.outer(tbeta,value[s+1][0:s+1])
        choice[s][0:s+1]=intmatrix.T.argmin(axis=1)
        value[s][0:s+1]=intmatrix[choice[s][0:s+1],np.arange(intmatrix.shape[1])]
        
    return value, choice


#initialization
decisions=np.arange(100)
number_of_stages=200
states=np.arange(101)

np.random.seed(2021)
beta=np.append(0,np.random.uniform(0,1,max(decisions)))
wcost=np.random.uniform(0,1)
cost=np.square(beta)



value, choice=dp(beta,cost,wcost,decisions,number_of_stages,states)

Error Messages:

TypingError: No implementation of function Function(<built-in function getitem>) found for signature:
 
getitem(array(float64, 1d, C), Tuple(slice<a:b>, none))
 
There are 22 candidate implementations:
      - Of which 20 did not match due to:
      Overload of function 'getitem': File: <numerous>: Line N/A.
        With argument(s): '(array(float64, 1d, C), Tuple(slice<a:b>, none))':
       No match.
      - Of which 2 did not match due to:
      Overload in function 'GetItemBuffer.generic': File: numba\core\typing\arraydecl.py: Line 162.
        With argument(s): '(array(float64, 1d, C), Tuple(slice<a:b>, none))':
       Rejected as the implementation raised a specific error:
         TypeError: unsupported array index type none in Tuple(slice<a:b>, none)
  raised from C:\ProgramData\Anaconda3\lib\site-packages\numba\core\typing\arraydecl.py:68

Solution

It seems numba does not support all kind of numpy fancy indexing. You have to get back to loops. This is pretty common with numba, you need to forget the numpy, vectorized way of coding to come back to explicit loops.

This should do the job, I just made the loops explicit.

@nb.njit
def dp(beta, cost, wcost, decisions, number_of_stages, states):
    tbeta = 1 - beta
    odcost = min((cost[-max(decisions) :] + wcost) / beta[-max(decisions) :])
    terminal = (max(states) - states) * odcost
    L = max(states)
    D = number_of_stages
    value = np.zeros((D + 1, L + 1))
    choice = np.zeros((D + 1, L)).astype(np.int64)
    value[-1] = terminal
    for s in range(D - 1, L - 2, -1):
        intmatrix = (
            cost.reshape(-1, 1)
            + np.outer(beta, value[s + 1][1 : L + 1])
            + np.outer(tbeta, value[s + 1][0:L])
        )
        choice[s] = intmatrix.T.argmin(axis=1)
        for i in range(intmatrix.shape[1]):
            value[s][i] = intmatrix[choice[s][i], i]

    for s in range(L - 2, -1, -1):
        intmatrix = (
            cost.reshape(-1, 1)
            + np.outer(beta, value[s + 1][1 : s + 2])
            + np.outer(tbeta, value[s + 1][0 : s + 1])
        )
        choice[s][0 : s + 1] = intmatrix.T.argmin(axis=1)
        for i in range(intmatrix.shape[1]):
            value[s][0 : s + 1][i] = intmatrix[choice[s][0 : s + 1][i], i]

    return value, choice