Search code examples
performancenumpyprocessing-efficiency

Where can I find the efficiency O(n) of some Numpy methods?


I am doing a school project and they've asked about the efficiency O(n) of some Numpy methods and I can't find them. Can anyone tell me where can I find those?

Example methods like:

numpy.linspace(x,y,z) numpy.meshgrid(x,y) numpy.zeroes(x,y)


Solution

  • You could simply measure the execution time for different problem sizes to get an estimate of the time complexity,

    • numpy.zeros(n) : non-deterministic
    • numpy.meshgrid(x,y) : O(n**2)
    • numpy.linspace(0, 1, n) : O(n**1.6)

    For instance, below is a code to measure the time complexity for numpy.meshgrid(x,y), that can be used for other numpy functions as well,

    In [1]: import numpy as np
       ...: from time import time
       ...: import matplotlib.pyplot as plt
       ...: from scipy.optimize import curve_fit
       ...: %matplotlib inline
       ...: 
       ...: def complexity_model(x, n, A, C):
       ...:     return A*x**n + C
       ...: 
       ...: problem_size = np.logspace(2, 4, 10)
       ...: 
       ...: res = []
       ...: for N in problem_size:
       ...:     x = np.linspace(0, 1, N)
       ...:     y = x.copy()
       ...:     
       ...:     t0 = time()
       ...:     np.meshgrid(x,y)
       ...:     dt = time() - t0
       ...:     res.append(dt)
       ...: 
       ...: nn = np.logspace(np.log10(problem_size.min()), np.log10(problem_size.max()), 100)  
       ...: 
       ...: time_to_solution = np.asarray(res)
       ...: fig, ax = plt.subplots(1,1)
       ...: ax.loglog(problem_size, time_to_solution, 'o-b')
       ...: 
       ...: mask = problem_size > 100 # ignore initial points
       ...: 
       ...: popt, _ = curve_fit(complexity_model, problem_size[mask],
       ...:                               time_to_solution[mask],
       ...:                               p0=(1.0, 1.0,  0.0) )
       ...: print(popt)
       ...: ax.loglog(nn, complexity_model(nn, *popt), '--k')
       ...: 
       ...: 
       ...: ax.set_xlabel('Problem size: N')
       ...: ax.set_ylabel('Time to solution 
      [  1.94816942e+00   1.40955397e-08  -7.33862899e-04]
    

    which gives the following curve,

    enter image description here

    For sufficiently large array sizes, numpy.meshgrid(x,y) has thus a time complexity of O(n**α), with α = 1.95 ≈ 2.