Search code examples
pythonlinear-programmingpulpinteger-programming

Implementing MILP in PULP with restricted choices of group


I am trying to implement the following problem in PULP:

https://math.stackexchange.com/questions/4232797/optimising-class-assignment-based-on-test-score-and-class-choice/4233234#4233234

Here is my attempt so far. I am not sure how to build the dictionary "student_allocations" into my restrictions to stop students being assigned to groups they did not choose. Any help is appreciated.

import pandas as pd
import numpy as np
from pulp import * 

# student ID number
student_id = np.arange(1,31)

# Test Scores
scores = [15, 29, 18, 74, 66, 89, 3, 77, 78, 70, 68, 47, 71, 37, 96, 27, 76, 25, 95, 16, 32, 11, 81, 82, 21, 57, 8, 5, 55, 31]

# Test scores mapped to student ID to make parameter s_i
s_i= dict(zip(student_id, scores))

# Groups
GROUPS = ['A','B','C','D','E','F'] 

# Student ID mapped to Group choices
student_allocations = {1: ['F', 'D'], 2: ['F', 'E', 'D', 'A', 'B', 'C'], 3: ['E', 'D'], 4: ['D', 'E', 'C', 'B', 'A'], 5: ['F', 'C', 'D'], 6: ['E', 'D', 'C', 'A'], 7: ['A'], 8: ['C', 'A', 'D', 'E', 'F'], 9: ['F', 'B'], 10: ['D', 'E'], 11: ['A', 'E', 'C', 'B', 'D'], 12: ['D', 'E', 'A', 'F'], 13: ['E'], 14: ['C', 'F', 'D'], 15: ['E', 'A', 'F', 'C', 'D'], 16: ['C', 'D', 'F', 'A', 'E'], 17: ['E', 'F'], 18: ['B'], 19: ['C', 'E', 'B', 'D'], 20: ['F', 'E'], 21: ['E', 'A', 'B', 'D', 'F', 'C'], 22: ['D', 'B', 'F', 'E', 'C', 'A'], 23: ['D', 'A', 'F', 'B', 'C'], 24: ['E', 'F', 'B', 'D', 'A'], 25: ['C'], 26: ['F', 'E', 'B'], 27: ['A', 'D', 'B'], 28: ['E', 'B', 'C', 'D'], 29: ['A', 'B', 'F', 'C', 'E', 'D'], 30: ['A', 'F', 'B', 'D']}


# setting the problem variable
prob = LpProblem("Timetabling", LpMinimize)

# define decision variable
x_ic = LpVariable.dicts("InClass", [(i,c) for i in student_id
                                          for c in GROUPS],
                       0,1,LpBinary)             

# define dummy min, max variables
l_c = LpVariable.dicts("lowest group score",GROUPS,0)      
h_c = LpVariable.dicts("highest group score",GROUPS,0)    

# define the objective function 
prob += lpSum(h_c[c] - l_c[c] for c in GROUPS) 

# setting the constraints
# constraint 1      
for i in student_id:
    prob += lpSum(x_ic[(i,c)] for c in GROUPS) == 1 
       
# constraint 2
N = 10  # maximum class size
for c in GROUPS:
    prob += lpSum(x_ic[(i,c)] for i in student_id) <= N
    
# constraint 3
for i in student_id:
    for c in GROUPS:
        prob += (l_c[c] - s_i[i]) <= (1-s_i[i]) * (1-x_ic[(i,c)])   

# constraint 4
for i in student_id:
    for c in GROUPS:
        prob += h_c[c] >= s_i[i] * x_ic[(i,c)]

# solve the problem
prob.solve()
print("Status: ", LpStatus[prob.status])

# output allocations
TOL = 0.000001 
for i in student_id:
    for c in GROUPS:
        if x_ic[(i,c)].varValue > TOL:
            print(i, c)

EDIT: Full solution given below with Method 3 incorporated into decision variable, constraints and output.

# setting the problem variable
prob = LpProblem("Timetabling", LpMinimize)

# define decision variable
x_ic = LpVariable.dicts("InClass", [(i,c) for i in student_id
                                          for c in student_allocations[i]
                                          ],
                       0,1,LpBinary)              

# define dummy min, max variables
l_c = LpVariable.dicts("lowest group score",GROUPS,0)      
h_c = LpVariable.dicts("highest group score",GROUPS,0)   


# define the objective function 
prob += lpSum(h_c[c] - l_c[c] for c in GROUPS) 

# setting the constraints
            
# constraint 1      
for i in student_id:
    prob += lpSum(x_ic[(i,c)] for c in GROUPS if c in student_allocations[i]) == 1 
       
# constraint 2
N = 10 # maximum class size
for c in GROUPS:
        prob += lpSum(x_ic[(i,c)] for i in student_id if c in student_allocations[i]) <= N
    
# constraint 3
for i in student_id:
    for c in GROUPS:
        if c in student_allocations[i]:
            prob += (l_c[c] - s_i[i]) <= (1-s_i[i]) * (1-x_ic[(i,c)])   

# constraint 4
for i in student_id:
    for c in GROUPS:
        if c in student_allocations[i]:
            prob += h_c[c] >= s_i[i] * x_ic[(i,c)]

# solve the problem
prob.solve()
print("Status: ", LpStatus[prob.status])

# output allocations
TOL = 0.000001 
for i in student_id:
    for c in GROUPS:
        if c in student_allocations[i]:
            if x_ic[(i,c)].varValue > TOL:
                print(i,c)  

Solution

  • There are many ways to do this, some obvious, some a bit more work.

    Method 1: add constraints

    Add constraints that forbid assignments to a group that is not allowed:

    for i in student_id:
        for c in GROUPS:
            if not (c in student_allocations[i]):
                prob += x_ic[(i,c)] == 0
    

    This is rather obvious, and I think you should have been able to figure this out.

    Method 2: add bounds

    A slightly more sophisticated approach would be to set upper bounds.

    for i in student_id:
        for c in GROUPS:
            if not (c in student_allocations[i]):
                 x_ic[(i,c)].upBound = 0
    

    This is a little bit more efficient, as no constraints are being generated, but rather simple upper bounds. A solver will typically convert singleton constraints to bounds, but this can save some effort generating the constraints.

    Method 3: sparse variables

    This is the most efficient way. Only generate the allowed variables. Change the variable to:

    # define decision variable
    x_ic = LpVariable.dicts("InClass", [(i,c) for i in student_id
                                              for c in GROUPS
                                              if c in student_allocations[i]],
                           0,1,LpBinary)             
    

    or shorter:

    # define decision variable
    x_ic = LpVariable.dicts("InClass", [(i,c) for i in student_id
                                              for c in student_allocations[i]],
                           0,1,LpBinary)             
    

    You need to protect the corresponding constraints as well. For large models, this is the best approach. For production models, I always use this method.