I have a number of variables, each assigned an integer value. I need to split these variables into three groups, with a predefined number of variables going into each group, while optimizing towards a predefined sum of the values in each group. Each group sum should be as close as possible to its predefined value, but can be above or below it. All variables must be used, and each variable can only be used once.
For example, I might have 10 variables...
Variable | Value |
---|---|
A1 | 98 |
A2 | 20 |
A3 | 30 |
A4 | 50 |
A5 | 20 |
A6 | 34 |
A7 | 43 |
A8 | 21 |
A9 | 32 |
A10 | 54 |
...and the goal could be to create three groups:
Group | #Variables | Sum optimized towards |
---|---|---|
X | 6 | 200 |
Y | 2 | 100 |
Z | 2 | 100 |
So group X should hold 6 variables and their sum should be as close as possible to 200 - but I need to optimize for all of the groups simultaneously.
I've tried to set up PuLP to perform this task. I seem to have found a solution for creating a single group, but I cannot figure out how to split the variables into groups and optimize the assignments based on the sums for each group. Is there a way to do this?
Below is my code for producing the first group with the presented variables.

    from pulp import LpMaximize, LpMinimize, LpProblem, lpSum, LpVariable, PULP_CBC_CMD, value, LpStatus

    keys = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "A10"]
    data = [98, 20, 30, 50, 20, 34, 43, 21, 32, 54]

    problem_name = 'repex'
    prob = LpProblem(problem_name, LpMaximize)

    optiSum = 200      # Optimize towards this sum
    variableCount = 6  # Number of variables that should be in the group

    # Create one binary decision variable per data element (named "0" .. "9")
    decision_variables = [LpVariable(str(i), cat='Binary') for i in range(len(data))]

    # Expression for the sum of the selected data elements
    sumConstraint = lpSum(data[i] * n for i, n in enumerate(decision_variables))

    # Expression for the number of elements used
    countConstraint = lpSum(decision_variables)

    # Add constraints
    prob += (sumConstraint <= optiSum)
    prob += (countConstraint == variableCount)

    # Objective: get the sum as close to optiSum as possible from below
    prob += sumConstraint

    # Solve
    optimization_result = prob.solve(PULP_CBC_CMD(msg=0))
    prob.writeLP(problem_name + ".lp")

    print("Status:", LpStatus[prob.status])
    print("Optimal Solution to the problem: ", value(prob.objective))
    print("Individual decision_variables: ")
    for v in prob.variables():
        print(v.name, "=", v.varValue)

Which produces the following output:

    Status: Optimal
    Optimal Solution to the problem: 200.0
    Individual decision_variables:
    0 = 0.0
    1 = 1.0
    2 = 0.0
    3 = 1.0
    4 = 0.0
    5 = 1.0
    6 = 1.0
    7 = 1.0
    8 = 1.0
    9 = 0.0

This seems to be a fairly standard "assignment" problem.
Let `z_ij` be a set of binary variables, where `z_ij` is 1 if object `i` is assigned to group `j` and 0 otherwise.
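In symbols, the assignment constraints might be written roughly as below. The names $v_i$ (value of object $i$), $n_j$ (required number of objects in group $j$) and $s_j$ (sum of group $j$) are shorthand introduced here for illustration; they do not come from the question.

```latex
\begin{align}
\sum_{j} z_{ij} &= 1 && \text{for every object } i \quad \text{(each object used exactly once)} \\
\sum_{i} z_{ij} &= n_j && \text{for every group } j \quad \text{(fixed group size)} \\
s_j &= \sum_{i} v_i \, z_{ij} && \text{for every group } j \quad \text{(group sum)} \\
z_{ij} &\in \{0, 1\}
\end{align}
```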
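Your objective then is to minimise the sum of the absolute deviations of the group sums from their target values. Because an absolute value is not linear, each deviation is represented by an auxiliary variable that is constrained to be at least as large as the deviation in both directions. Working example code below: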

    from pulp import LpMinimize, LpProblem, LpStatus, LpVariable, PULP_CBC_CMD, lpSum, value

    data = [98, 20, 30, 50, 20, 34, 43, 21, 32, 54]
    n_object = len(data)
    # object_keys = ["A" + str(i) for i in range(1, n_object + 1)]
    object_keys = range(n_object)

    group_sum_targets = [200, 100, 100]
    group_n_objects = [6, 2, 2]
    n_group = len(group_sum_targets)
    group_keys = range(n_group)

    problem_name = 'repex'

    # Seek to minimise absolute deviation from the target sums
    prob = LpProblem(problem_name, LpMinimize)

    # Primary decision variables - the assignments.
    # The index list is passed positionally because the keyword name
    # differs between PuLP versions (indexs vs. indices).
    z = LpVariable.dicts('z',
                         [(i, j) for i in object_keys for j in group_keys],
                         cat='Binary')

    # Aux. decision variables
    group_sums = LpVariable.dicts('group_sums', group_keys, cat='Continuous')
    group_abs_error = LpVariable.dicts('group_abs_error', group_keys, cat='Continuous')

    # Objective - assumes all groups evenly penalised for missing
    # their target sum, and penalty for 'over' and 'under' have same
    # weighting
    prob += lpSum([group_abs_error[j] for j in group_keys])

    # Constraints on groups
    for j in group_keys:
        prob += group_sums[j] == lpSum([z[(i, j)] * data[i] for i in object_keys])
        # Together these two force group_abs_error[j] >= |group_sums[j] - target|
        prob += group_abs_error[j] >= group_sums[j] - group_sum_targets[j]
        prob += group_abs_error[j] >= group_sum_targets[j] - group_sums[j]
        # Constrain number of objects used
        prob += lpSum([z[(i, j)] for i in object_keys]) == group_n_objects[j]

    # Constraints on objects
    for i in object_keys:
        # Every object used exactly once
        prob += lpSum([z[(i, j)] for j in group_keys]) == 1

    # Solve
    optimization_result = prob.solve(PULP_CBC_CMD(msg=0))

    print("Status:", LpStatus[prob.status])
    print("Optimal Solution to the problem: ", value(prob.objective))
    print("Individual decision_variables: ")
    for v in prob.variables():
        print(v.name, "=", v.varValue)

This gives me the following output (only the non-zero z's are shown). As you can see, the groups have 6, 2 and 2 objects as desired, and the sums are reasonably close to the targets.

    Status: Optimal
    Optimal Solution to the problem: 34.0
    Individual decision_variables:
    group_abs_error_0 = 9.0
    group_abs_error_1 = 18.0
    group_abs_error_2 = 7.0
    group_sums_0 = 191.0
    group_sums_1 = 118.0
    group_sums_2 = 93.0
    z_(0,_1) = 1.0
    z_(1,_0) = 1.0
    z_(2,_0) = 1.0
    z_(3,_2) = 1.0
    z_(4,_1) = 1.0
    z_(5,_0) = 1.0
    z_(6,_2) = 1.0
    z_(7,_0) = 1.0
    z_(8,_0) = 1.0
    z_(9,_0) = 1.0

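With only 10 objects, the result is small enough to cross-check without a solver. The sketch below recomputes the group sums from the assignment printed above and then enumerates every split into groups of 6, 2 and 2 to see whether any does better. The `reported` dictionary is transcribed by hand from the z output, and the helper names `total_deviation` and `brute_force_best` are made up for this illustration.

```python
from itertools import combinations

data = [98, 20, 30, 50, 20, 34, 43, 21, 32, 54]
group_sum_targets = [200, 100, 100]
group_n_objects = [6, 2, 2]

# Assignment copied from the solver output: object index -> group index
reported = {0: 1, 1: 0, 2: 0, 3: 2, 4: 1, 5: 0, 6: 2, 7: 0, 8: 0, 9: 0}


def total_deviation(groups):
    """Sum of |group sum - target| over the groups (tuples of object indices)."""
    return sum(abs(sum(data[i] for i in g) - t)
               for g, t in zip(groups, group_sum_targets))


reported_groups = [tuple(i for i, g in reported.items() if g == j)
                   for j in range(len(group_sum_targets))]
reported_sums = [sum(data[i] for i in g) for g in reported_groups]


def brute_force_best():
    """Try every split into three groups of the required sizes and
    return the smallest total deviation found."""
    objects = set(range(len(data)))
    best = None
    for x in combinations(sorted(objects), group_n_objects[0]):
        rest = objects - set(x)
        for y in combinations(sorted(rest), group_n_objects[1]):
            # Whatever is left over forms the third group
            z = tuple(sorted(rest - set(y)))
            dev = total_deviation([x, y, z])
            if best is None or dev < best:
                best = dev
    return best


print(reported_sums, total_deviation(reported_groups), brute_force_best())
# prints: [191, 118, 93] 34 34
```

The enumeration only visits 210 * 6 = 1260 splits here, so it is fine as a sanity check, but it grows combinatorially with the number of objects, which is why the integer programming formulation is the more practical approach for larger inputs.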