Tags: python, performance, numpy, cplex, integer-programming

Need to speed up operations on numpy arrays in Python


I am solving an integer programming model by importing CPLEX as a library in Python. Say the optimization problem has a constraint of the following form (Ax = b):

x0 + x1 + x1 + x3 = 1

The indices of the x variables in this constraint are 0, 1, 1, and 3. These are stored in a list: indices = [0, 1, 1, 3]. The coefficients of these variables are stored in another list: coeff = [1, 1, 1, 1].

CPLEX cannot accept duplicate indices, so the constraint should look like this:

x0 + 2x1 + x3 = 1

and the two lists should be updated like this:

indices = [0, 1, 3]
coeff = [1, 2, 1]

I have this function that takes indices and coefficients as two arguments and returns the merged lists:

def manipulate(indices, coeff):
    u = np.unique(indices)
    sums = {ui: np.sum([coeff[c[0]] for c in np.argwhere(indices == ui)]) for ui in u}
    return list(sums.keys()), list(sums.values())

So, manipulate([0,1,1,3], [1,1,1,1]) returns ([0, 1, 3], [1, 2, 1]).

My problem is that when I have many variables, these lists can each have a length of a million, and I have millions of such constraints. When I solve my optimization problem using CPLEX, the program becomes very slow. I tracked the time spent in each function and realized that the most time-consuming parts of my code are these calculations, and I guess that is because of how numpy is used here. I need to make this function more efficient to hopefully decrease the execution time. Could you please share any thoughts and suggestions on how to change the function manipulate?

Thanks a lot,


Solution

  • Without going for native-code-based extensions, there are probably always compromises:

    • Numpy / vectorization approaches miss out on hash-based algorithmics and imho will suffer from algorithmic-complexity drawbacks (e.g. the need to sort, or to make multiple passes over the data); see the bincount-based sketch after this list.

    • Python-based hash approaches will suffer from slow looping.

    Nonetheless, I do think your approach combines the worst of both worlds, and you could gain something by changing it.
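
    As an aside (not part of the original answer): a fully vectorized merge can also be written with np.unique(..., return_inverse=True) plus np.bincount, which sums the coefficients per unique index in one vectorized pass. It still sorts internally, so the complexity caveat above applies, but it avoids the per-unique-value Python loop of manipulate and does not require pre-sorted input. A minimal sketch, assuming indices is a numpy integer array and the coefficients are integers (np.bincount with weights returns floats, hence the cast):

    import numpy as np

    def manipulate_bincount(indices, coeff):
        # inverse[i] is the position of indices[i] within `unique`
        unique, inverse = np.unique(indices, return_inverse=True)
        # sum the coeff entries that map to the same unique position
        summed = np.bincount(inverse, weights=coeff)
        return unique.tolist(), summed.astype(int).tolist()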

    Some code

    from time import perf_counter as pc
    from collections import defaultdict
    import numpy as np
    np.random.seed(0)
    
    def manipulate(indices, coeff):
        u = np.unique(indices)
        sums = {ui: np.sum([coeff[c[0]] for c in np.argwhere(indices == ui)]) for ui in u}
        return list(sums.keys()), list(sums.values())
    
    # this assumes pre-sorted indices
    def manipulate_alt(indices, coeff):
        # first position of each unique index in the sorted input
        unique, first_pos = np.unique(indices, return_index=True)
        cum_coeffs = np.cumsum(coeff)
        # cumulative sum just before each group starts
        # (for the first group this wraps around to the grand total)
        before_group = cum_coeffs[first_pos - 1]
        # roll left so each entry is the cumulative sum at the end of its own group
        group_ends = np.roll(before_group, -1)
        # consecutive differences of the group ends give the per-group sums
        group_sums = np.ediff1d(group_ends, to_begin=group_ends[0])

        return unique.tolist(), group_sums.tolist()
    
    def manipulate_pure_py(indices, coeff):
        # single pass: accumulate the coefficient of each index in a hash map
        final = defaultdict(int)
        n = len(indices)
        for i in range(n):
            final[indices[i]] += coeff[i]

        return list(final.keys()), list(final.values())
    
    # BAD NON-SCIENTIFIC BENCHMARK
    # ----------------------------
    
    ITERATIONS = 10
    SIZE = 1000000
    
    accu_time_manipulate = 0.0
    accu_time_manipulate_alt = 0.0
    accu_time_manipulate_py = 0.0
    
    for i in range(ITERATIONS):
        indices = np.sort(np.random.randint(0, 10000, size=SIZE))
        coeffs = np.random.randint(1, 100, size=SIZE)

        start = pc()
        sol = manipulate(indices, coeffs)
        end = pc()
        accu_time_manipulate += (end - start)

        start = pc()
        sol_alt = manipulate_alt(indices, coeffs)
        end = pc()
        accu_time_manipulate_alt += (end - start)

        start = pc()
        sol_py = manipulate_pure_py(indices, coeffs)
        end = pc()
        accu_time_manipulate_py += (end - start)

        # all three implementations must agree
        assert sol[0] == sol_alt[0]
        assert sol[1] == sol_alt[1]

        assert sol[0] == sol_py[0]
        assert sol[1] == sol_py[1]


    print(accu_time_manipulate)
    print(accu_time_manipulate_alt)
    print(accu_time_manipulate_py)
    

    Timings

    164.34614480000005
    0.24998690000001744
    8.751806900000059
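
    For the native-code route mentioned in the remarks below, the pure-Python loop compiles almost unchanged with numba (this sketch is not part of the original answer and was not benchmarked). It assumes numpy integer arrays and non-negative indices, and trades memory proportional to the largest index for a single compiled pass, which fits IP models where variable indices are dense:

    import numpy as np
    from numba import njit

    @njit
    def manipulate_numba(indices, coeff):
        # dense accumulator keyed by variable index
        size = indices.max() + 1
        acc = np.zeros(size, dtype=np.int64)
        seen = np.zeros(size, dtype=np.bool_)
        for i in range(indices.shape[0]):
            acc[indices[i]] += coeff[i]
            seen[indices[i]] = True
        kept = np.nonzero(seen)[0]  # only the indices that actually occurred
        return kept, acc[kept]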
    

    Remarks

    • The benchmark is bad
    • I don't trust the numpy-based hack 100%
    • Just use the simple dict-based approach (or go the native-code route)
    • Depending on the modelling task, you might also combine this merging with building the model: merge directly while generating each constraint, instead of delaying it to a post-processing step (see the sketch below)
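
    To illustrate the last remark: if coefficients are accumulated per variable index while a constraint's terms are being generated, duplicates never arise and the post-processing pass disappears entirely. A minimal sketch; generate_terms() is a hypothetical placeholder for whatever yields the (index, coefficient) pairs of one constraint in your model:

    from collections import defaultdict

    row = defaultdict(int)                  # variable index -> summed coefficient
    for var_index, c in generate_terms():   # hypothetical source of term pairs
        row[var_index] += c

    indices, coeff = list(row.keys()), list(row.values())
    # the merged lists can now be handed to the CPLEX Python API,
    # e.g. as cplex.SparsePair(ind=indices, val=coeff)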