Tags: python, optimization, linear-programming, gurobi, integer-programming

Gurobi prefix sum optimization


Consider the following Gurobi model:

import gurobipy as gb
import numpy as np
N = 100
x = np.random.randint(10, high=2*N, size=N)
model = gb.Model("ACC")
amp_i_vars = model.addVars(N, vtype=gb.GRB.BINARY, name='ai')
model.setObjective(amp_i_vars.sum('*'), gb.GRB.MAXIMIZE)
model.addConstrs((gb.quicksum(amp_i_vars[i] for i in range(r+1)) <= x[r]
                  for r in range(N)), "SumConstr")

Here we are essentially just trying to fill ai with as many set bits as possible, such that the sum of the bits up to position r is never greater than x[r].

My question is whether or not GurobiPy is "smart" about how it builds these constraints, i.e. whether it computes a prefix sum of ai or instead actually recomputes the sum for each r < N. The former would be linear time, whereas the latter would be quadratic. I have an LP that contains many such sums and constraints, and I'm wondering whether or not it would be better to create a separate variable to store the prefix sum of each sequence, to prevent GurobiPy from recomputing the sum for every constraint, but I don't want to do that if it is already smart enough.
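
One way to check this empirically, sketched below for the model built by the snippet above, is to update the model and look at the statistics Gurobi reports, or write the model out to an LP file and inspect the constraints directly:

model.update()
print("constraints:", model.NumConstrs)
print("non-zeros:  ", model.NumNZs)  # N*(N+1)/2 if every prefix sum is expanded
model.write("acc.lp")                # human-readable dump of each constraint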


Solution

  • Your exact formulation has O(N^2) non-zeros, so you are stuck with an O(N^2) algorithm to build it. You can, however, avoid re-creating the expression from scratch for every constraint by using the following, more procedural loop.

    import gurobipy as grb
    import numpy as np
    np.random.seed(10)
    
    N = 5000
    x = np.random.randint(10, high=2*N, size=N)
    obj = -np.random.randint(10, high=2*N, size=N)
    model = grb.Model("ACC")
    
    # more interesting objective
    amp_i_vars = model.addVars(N, vtype=grb.GRB.BINARY, name='ai', obj=obj)
    model.update()
    cum = grb.LinExpr()
    for i, ai in amp_i_vars.items():
        # grow the prefix sum incrementally instead of rebuilding it for each r
        cum += ai
        model.addConstr(cum <= x[i])
    model.optimize()
    

    However, you can formulate an equivalent model with only O(N) non-zeros by adding a parallel list of variables representing the cumulative sum, and using the recurrence cum[i] = cum[i-1] + a[i]. This also leads to a model that solves much faster.

    import gurobipy as grb
    import numpy as np
    N = 5000
    np.random.seed(10)
    x = np.random.randint(10, high=2*N, size=N)
    obj = -np.random.randint(10, high=2*N, size=N)
    model = grb.Model("ACC")
    
    # more interesting objective function
    amp_i_vars = model.addVars(N, vtype=grb.GRB.BINARY, name='ai', obj=obj)
    # since cum_vars are variables, use simple upper bound
    cum_vars = model.addVars(N, vtype=grb.GRB.CONTINUOUS, name='cum', ub=x)
    
    prev_cum = 0
    for i, (ai, cum) in enumerate(zip(amp_i_vars.values(), cum_vars.values())):
        model.addConstr(cum == prev_cum + ai, name="sum_constr." + str(i))
        prev_cum = cum
    model.optimize()
    

    For N=5000, this solves in 0.5 seconds versus 16 seconds for the dense model.
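
    As a quick sanity check (just a sketch; both snippets above call their model "model", so dense_model and sparse_model below are placeholder names for the two formulations), you can confirm the size difference by comparing the constraint-matrix non-zero counts that Gurobi reports:

    # placeholder names: dense_model is the O(N^2) formulation, sparse_model
    # the recurrence-based one; both must already be built as shown above
    dense_model.update()
    sparse_model.update()
    print("dense  non-zeros:", dense_model.NumNZs)   # roughly N*(N+1)/2
    print("sparse non-zeros:", sparse_model.NumNZs)  # roughly 3*N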