Tags: python, performance, numpy, cplex, integer-programming

Need to speed up operations on numpy arrays in Python


I am solving an integer programming model by importing CPLEX as a library in Python. Say the optimization problem has a constraint of the following form (Ax = b):

x0 + x1 + x1 + x3 = 1

The indices of the x variables in this constraint are 0, 1, 1, and 3. These are stored in a list: indices = [0, 1, 1, 3]. The coefficients of these variables are stored in another list: coeff = [1, 1, 1, 1].

CPLEX cannot accept duplicate indices, so the constraint should look like this:

x0 + 2x1 + x3 = 1

and the two lists should be updated like this:

indices = [0, 1, 3]
coeff = [1, 2, 1]

I have this function that takes indices and coefficients as two arguments and returns the merged lists:

def manipulate(indices, coeff):
    u = np.unique(indices)
    sums = {ui: np.sum([coeff[c[0]] for c in np.argwhere(indices == ui)]) for ui in u}
    return list(sums.keys()), list(sums.values())

So, manipulate([0,1,1,3], [1,1,1,1]) returns ([0, 1, 3], [1, 2, 1]).

My problem is that when I have many variables, these lists can each have a length of a million, and I have millions of such constraints. When I solve my optimization problem using CPLEX, the program becomes very slow. I tracked the time spent in each function and realized that the most time-consuming parts of my code are these calculations, and I guess that is because of how numpy is used here. I need to make this function more efficient to hopefully decrease the execution time. Could you please share any thoughts and suggestions on how to change the function manipulate?

Thanks a lot,


Solution

  • Without going for native-code-based extensions, there are probably always compromises:

    • Numpy / vectorization approaches miss out on hash-based algorithmics and imho will suffer from algorithmic-complexity drawbacks (e.g. the need to sort, or to make multiple passes over the data); see the bincount-based sketch after this list.

    • Python-based hash approaches will suffer from slow looping.

    Nonetheless, I do think your approach combines the worst of both worlds, and you could gain something by changing it.
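
    As an aside (not part of the original answer): a fully vectorized merge can also be written with np.unique(..., return_inverse=True) plus np.bincount, which sums the coefficients per unique index in one vectorized pass. It still sorts internally, so the complexity caveat above applies, but it avoids the per-unique-value Python loop of manipulate and does not require pre-sorted input. A minimal sketch, assuming indices is a numpy integer array and the coefficients are integers (np.bincount with weights returns floats, hence the cast):

    import numpy as np

    def manipulate_bincount(indices, coeff):
        # inverse[i] is the position of indices[i] within `unique`
        unique, inverse = np.unique(indices, return_inverse=True)
        # sum the coeff entries that map to the same unique position
        summed = np.bincount(inverse, weights=coeff)
        return unique.tolist(), summed.astype(int).tolist()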

    Some code

    from time import perf_counter as pc
    from collections import defaultdict
    import numpy as np
    np.random.seed(0)
    
    def manipulate(indices, coeff):
        u = np.unique(indices)
        sums = {ui: np.sum([coeff[c[0]] for c in np.argwhere(indices == ui)]) for ui in u}
        return list(sums.keys()), list(sums.values())
    
    # this assumes pre-sorted indices
    def manipulate_alt(indices, coeff):
        # first position of each unique index in the sorted input
        unique, first_pos = np.unique(indices, return_index=True)
        cum_coeffs = np.cumsum(coeff)
        # cumulative sum just before each group starts
        # (for the first group this wraps around to the grand total)
        before_group = cum_coeffs[first_pos - 1]
        # roll left so each entry is the cumulative sum at the end of its own group
        group_ends = np.roll(before_group, -1)
        # consecutive differences of the group ends give the per-group sums
        group_sums = np.ediff1d(group_ends, to_begin=group_ends[0])

        return unique.tolist(), group_sums.tolist()
    
    def manipulate_pure_py(indices, coeff):
        # single pass: accumulate the coefficient of each index in a hash map
        final = defaultdict(int)
        n = len(indices)
        for i in range(n):
            final[indices[i]] += coeff[i]

        return list(final.keys()), list(final.values())
    
    # BAD NON-SCIENTIFIC BENCHMARK
    # ----------------------------
    
    ITERATIONS = 10
    SIZE = 1000000
    
    accu_time_manipulate = 0.0
    accu_time_manipulate_alt = 0.0
    accu_time_manipulate_py = 0.0
    
    for i in range(ITERATIONS):
        indices = np.sort(np.random.randint(0, 10000, size=SIZE))
        coeffs = np.random.randint(1, 100, size=SIZE)

        start = pc()
        sol = manipulate(indices, coeffs)
        end = pc()
        accu_time_manipulate += (end - start)

        start = pc()
        sol_alt = manipulate_alt(indices, coeffs)
        end = pc()
        accu_time_manipulate_alt += (end - start)

        start = pc()
        sol_py = manipulate_pure_py(indices, coeffs)
        end = pc()
        accu_time_manipulate_py += (end - start)

        # all three implementations must agree
        assert sol[0] == sol_alt[0]
        assert sol[1] == sol_alt[1]

        assert sol[0] == sol_py[0]
        assert sol[1] == sol_py[1]


    print(accu_time_manipulate)
    print(accu_time_manipulate_alt)
    print(accu_time_manipulate_py)
    

    Timings

    164.34614480000005
    0.24998690000001744
    8.751806900000059
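
    For the native-code route mentioned in the remarks below, the pure-Python loop compiles almost unchanged with numba (this sketch is not part of the original answer and was not benchmarked). It assumes numpy integer arrays and non-negative indices, and trades memory proportional to the largest index for a single compiled pass, which fits IP models where variable indices are dense:

    import numpy as np
    from numba import njit

    @njit
    def manipulate_numba(indices, coeff):
        # dense accumulator keyed by variable index
        size = indices.max() + 1
        acc = np.zeros(size, dtype=np.int64)
        seen = np.zeros(size, dtype=np.bool_)
        for i in range(indices.shape[0]):
            acc[indices[i]] += coeff[i]
            seen[indices[i]] = True
        kept = np.nonzero(seen)[0]  # only the indices that actually occurred
        return kept, acc[kept]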
    

    Remarks

    • The benchmark is bad
    • I don't trust the numpy-based hack 100%
    • Just use the simple dict-based approach (or go the native-code route)
    • Depending on the modelling task, you might also combine this merging with building the model: merge directly while generating each constraint, instead of delaying it to a post-processing step (see the sketch below)
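
    To illustrate the last remark: if coefficients are accumulated per variable index while a constraint's terms are being generated, duplicates never arise and the post-processing pass disappears entirely. A minimal sketch; generate_terms() is a hypothetical placeholder for whatever yields the (index, coefficient) pairs of one constraint in your model:

    from collections import defaultdict

    row = defaultdict(int)                  # variable index -> summed coefficient
    for var_index, c in generate_terms():   # hypothetical source of term pairs
        row[var_index] += c

    indices, coeff = list(row.keys()), list(row.values())
    # the merged lists can now be handed to the CPLEX Python API,
    # e.g. as cplex.SparsePair(ind=indices, val=coeff)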