I'm developing optimization code for a complex reservoir operations problem, part of which requires calculating the objective function for a large number of candidate solutions. I'm testing the optimizer on the Rosenbrock function and trying to improve its speed. Profiling showed that evaluating the objective function inside a for loop was one of the main bottlenecks, so I wrote a version that evaluates multiple sets of decision variables at once. I now have two objective function calculators: FO for one set of decision variables and P_FO for multiple sets. Since evaluating the objective function is still one of the slowest parts of my code, I would like to speed things up further using @jit. I tested both functions with @jit and am finding that P_FO is slower with @jit than without it. The code is below:
import time
import numpy as np
from numba import jit
def FO(X):
    #Rosenbrock function for a single set of decision variables
    ObjV=0
    for i in range(65-1):
        F=100*(X[i+1]-X[i]**2)**2+(X[i]-1)**2
        ObjV+=F
    return ObjV
t0=time.time()
X=10+np.zeros(65)
for i in range(5000):
    FO(X)
t1 = time.time()
total = t1-t0
print("time FO="+str(total))
@jit
def FO(X):
    #Rosenbrock function for a single set of decision variables
    ObjV=0
    for i in range(65-1):
        F=100*(X[i+1]-X[i]**2)**2+(X[i]-1)**2
        ObjV+=F
    return ObjV
t0=time.time()
X=10+np.zeros(65)
for i in range(5000):
    FO(X)
t1 = time.time()
total = t1-t0
print("time FO with @jit="+str(total))
def P_FO(X):
    #Rosenbrock function for multiple sets of decision variables
    ObjV=np.zeros(X.shape[0])
    for i in range(X.shape[1]-1):
        F=100*(X[:,i+1]-X[:,i]**2)**2+(X[:,i]-1)**2
        ObjV+=F
    return ObjV
t0=time.time()
X=10+np.zeros((65, 5000))
P_FO(X)
t1 = time.time()
total = t1-t0
print("time P_FO="+str(total))
@jit
def P_FO(X):
    #Rosenbrock function for multiple sets of decision variables
    ObjV=np.zeros(X.shape[0])
    for i in range(X.shape[1]-1):
        F=100*(X[:,i+1]-X[:,i]**2)**2+(X[:,i]-1)**2
        ObjV+=F
    return ObjV
t0=time.time()
X=10+np.zeros((65, 5000))
P_FO(X)
t1 = time.time()
total = t1-t0
print("time P_FO with @jit="+str(total))
Results were:
time FO=0.523999929428
time FO with @jit=0.0720000267029
time P_FO=0.0380001068115
time P_FO with @jit=0.229000091553
Can anyone point me to a reason why @jit slows down the multi-solution objective function P_FO? Is it because of the use of np.zeros, or perhaps array.shape?
Numba functions are compiled lazily, i.e. not until the first time they are called, so your timings are capturing the one-time compilation overhead. If I call each function once before running the timed sections, I get:
time FO=0.4103426933288574
time FO with @jit=0.0020008087158203125
time P_FO=0.04154801368713379
time P_FO with @jit=0.004002809524536133