Linear programming solver ignore constraints

I am working on a small project to learn linear programming.

I have formulated a problem where we want to assign P people to N projects. Each person give a preference, 2 (strongly desired, 1 desired, 0 not desired) for each of the N project.

We define preferences as a matrix of size PxN. then, we create a boolean matrix D PxN that represent the decision, of in which project each person is going to be.

    from pulp import *
    import pandas as pd
    
    preferences = [
                    [2, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
                    [1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0],
                    [0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                    [0, 1, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0],
                    [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0],
                    [0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 2],
                    [0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
                    [0, 0, 0, 0, 1, 0, 2, 0, 2, 0, 0, 1, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0]
                    ]
    
    N = len(preferences[0]) # Number of projects
    P = len(preferences) # Number of people
    D = [[LpVariable("D[%i,%i]" % (i, j), 0, 1, LpInteger) for j in range(N)] for i in range(P)]

And I'd like to maximize preferences * D (where * is the element-wise multiplication).

I am defining some constraints:

A person can be assigned to only 1 project


    for i in range(P):
        prob += lpSum(D[i][j] for j in range(N)) == 1

A project can't have only one person, (but a project can have 0 people assigned to it, which means that the project is discarded).

To enforce the second constraint, I followed this suggestion. https://math.stackexchange.com/questions/37075/how-can-not-equals-be-expressed-as-an-inequality-for-a-linear-programming-model.


    B = 100 # large value
    Z = [LpVariable("Z[%i]" % j, 0, 1, LpInteger) for j in range(N)]  #Auxiliary boolean
    for j in range(N):
        prob += lpSum(D[i][j] for i in range(P)) <= B * Z[j]
        prob += lpSum(D[i][j] for i in range(P)) >= 2-B+B*Z[j]

So far everything seems to work, the result D that maximize the function is a binary matrix that respect all the constraints.

Example output for D:

1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0

The problems come when I try to penalize groups of size different from 3, the more they get bigger or smaller than 3. (I'd actually like to do it for 3 and 4, starting from penalizing only at size 2 and size >=5, but better start simple).

To do so, I create a vector "Size penalty" (SP) and I try to enforce

|Sum(D[i][j] for i in range(P)) - 3| <= SP[j]

breaking down this modulo into two equations

Sum(D[i][j] for i in range(P)) - 3 <= SP[j] and
Sum(D[i][j] for i in range(P)) - 3 >= -SP[j]


    SP1 = [LpVariable("SP[%i]" % j, 0, 1, LpInteger) for j in range(N)]
    for j in range(N):
        prob += lpSum(D[i][j] for i in range(P)) - 3 <= SP1[j]
        prob += lpSum(D[i][j] for i in range(P)) - 3 >= -SP1[j]

and adding the Sum(SP) to the function to maximize

f = M * D - SUM(SP).


    prob += lpSum(D[i][j] * preferences[i][j] for i in range(P) for j in range(N)) - lpSum(SP1[j] for j in range(N))

When I add this, the model still works, but it somehow breaks the constraint of D being an integer between 0 and 1. Example of broken output:

1.0 -3.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 
1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 -1.0 0.0 0.0 
0.0 1.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 -1.0 0.0 
0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 1.0 0.0 -2.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 -8.0 9.0 
0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -1.0 1.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 1.0 0.0 -1.0 0.0 1.0 0.0 -13.0 0.0 0.0 1.0 0.0 12.0 0.0 
0.0 1.0 -1.0 1.0 1.0 -3.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 
-1.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 2.0 0.0 0.0 0.0 -2.0 0.0 
1.0 0.0 1.0 1.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 3.0 1.0 -8.0

How can this be, considering that I defined D as Lp integers?


    D = [[LpVariable("D[%i,%i]" % (i, j), 0, 1, LpInteger) for j in range(N)] for i in range(P)]

Full code:


    from pulp import *
    import pandas as pd
    
    
    preferences = [
                    [2, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
                    [1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0],
                    [0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                    [0, 1, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0],
                    [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0],
                    [0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 2],
                    [0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
                    [0, 0, 0, 0, 1, 0, 2, 0, 2, 0, 0, 1, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0]
                    ]
    
    N = len(preferences[0]) # Number of projects
    P = len(preferences) # Number of people
    B = 100 # B large value
    
    # Create the 'prob' variable to contain the problem data
    prob = LpProblem("Project Assignment", LpMaximize)
    
    
    # Variables
    # D[i, j] = 1 if person i is assigned to project j and 0 otherwise. To be optimized
    D = [[LpVariable("D[%i,%i]" % (i, j), 0, 1, LpInteger) for j in range(N)] for i in range(P)]
    
    #Define auxiliary Z[N] for inequality
    Z = [LpVariable("Z[%i]" % j, 0, 1, LpInteger) for j in range(N)]
    
    #Define size penalty SP[N] variable as int for each project
    SP1 = [LpVariable("SP[%i]" % j, 0, 1, LpInteger) for j in range(N)]
    # Constraints
    
    # Each person is assigned to exactly one project, use Add(sum(D[i, j]) == 1) for each person i
    for i in range(P):
        prob += lpSum(D[i][j] for j in range(N)) == 1
    
    # Each project has not 1 person assigned. Use auxiliary variable z to express inequality
    # https://math.stackexchange.com/questions/37075/how-can-not-equals-be-expressed-as-an-inequality-for-a-linear-programming-model
    for j in range(N):
        prob += lpSum(D[i][j] for i in range(P)) <= B * Z[j]
        prob += lpSum(D[i][j] for i in range(P)) >= 2-B+B*Z[j]
    
    #Add penalty the more the number of people is different from 3 for each project
    for j in range(N):
        prob += lpSum(D[i][j] for i in range(P)) - 3 <= SP1[j]
        prob += lpSum(D[i][j] for i in range(P)) - 3 >= -SP1[j]
    
    
    # Objective
    # Maximize the total preference score
    prob += lpSum(D[i][j] * preferences[i][j] for i in range(P) for j in range(N)) - lpSum(SP1[j] for j in range(N))
    
    # Solve
    status = prob.solve()
    
    #print D.values()
    for i in range(P):
        for j in range(N):
            print(D[i][j].value(), end=" ")
        print()
    
    #print Z.values()
    print("Zeta values")
    for j in range(N):
        print(Z[j].value(), end=" ")
    
    #Assert that sum(D[i][j]) over i is != 1 for each column j
    for j in range(N):
        assert sum(D[i][j].value() for i in range(P)) != 1

Solution

Nice question, with examples and reproducible code!

Your main issue is with the construction of the SP1 variable. But before we get to that, the main thing you are not doing is inspecting the solver status, which you must do before looking at results. When you execute your model, the status returned is infeasible so whatever result the beast puts out are total junk and should be disregarded.

So why is it infeasible? Well, because you (unintentionally) limited SP1 to {0, 1} by defining limits, where in fact the "delta" could be up to the number of projects. I changed that and it solves instantly.

I also tweaked a couple of small things (see comments) and showed a few accelerants (where you can just use LpSum(<var>) if you want all of it without specifying indices).

I also added an additional binary indicator if a project is done by anyone so you can count the number of complete projects if that is of interest. And for giggles, showed an alternate objective function you might consider. If you go that route, be careful with the weighting... You need to do a little stubby-pencil math to make sure the bonus/penalty doesn't eclipse getting things done. Anyhow.... Fixed:

Code:

from pulp import *
# import pandas as pd


preferences = [
                [2, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
                [1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0],
                [0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0],
                [0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0],
                [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 2],
                [0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
                [0, 0, 0, 0, 1, 0, 2, 0, 2, 0, 0, 1, 0, 0],
                [0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0],
                [0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0],
                [0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 0, 0, 0],
                [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0]
                ]

N = len(preferences[0]) # Number of projects
P = len(preferences) # Number of people
B = N + 1 # <-- this is "big enough" and grows with the model

# Create the 'prob' variable to contain the problem data
prob = LpProblem("Project Assignment", LpMaximize)


# Variables
# D[i, j] = 1 if person i is assigned to project j and 0 otherwise. To be optimized
D = [[LpVariable("D[%i,%i]" % (i, j), cat=LpBinary) for j in range(N)] for i in range(P)]  # <- LpBinary

# Indicator if project is complete
proj_complete = LpVariable.dicts('proj_complete', list(range(N)), cat=LpBinary) # <- I like this format for making vars, but yours works too.

#Define auxiliary Z[N] for inequality
Z = [LpVariable("Z[%i]" % j, cat=LpBinary) for j in range(N)] # <- LpBinary

#Define size penalty SP[N] variable as int for each project
SP1 = [LpVariable("SP[%i]" % j, 0, N, LpInteger) for j in range(N)]
# Constraints

# Each person is assigned to exactly one project, use Add(sum(D[i, j]) == 1) for each person i
for i in range(P):
    prob += lpSum(D[i][j] for j in range(N)) == 1

# Each project has not 1 person assigned. Use auxiliary variable z to express inequality
# https://math.stackexchange.com/questions/37075/how-can-not-equals-be-expressed-as-an-inequality-for-a-linear-programming-model
for j in range(N):
        prob += lpSum(D[i][j] for i in range(P)) <= B * Z[j]
        prob += lpSum(D[i][j] for i in range(P)) >= 2 - B + B * Z[j]

#Add penalty the more the number of people is different from 3 for each project
for j in range(N):
    prob += lpSum(D[i][j] for i in range(P)) - 3 <= SP1[j]   # catch the "overage"
    prob += lpSum(D[i][j] for i in range(P)) - 3 >= -SP1[j]  # catch the "underage"

# link the assignment variable to project completion:
for project in range(N):
    prob += proj_complete[project] <= sum(D[person][project] for person in range(P))


# Objective
# Maximize the total preference score
# prob += lpSum(D[i][j] * preferences[i][j] for i in range(P) for j in range(N)) - lpSum(SP1[j] for j in range(N))

# alternate objective:  get as many projects done as possible, some bonus for preferences and penalties for over/under assign
proj_assigned, pref_bonus, staff_penalty = 1.0, 0.2, 0.1
prob +=     proj_assigned * lpSum(proj_complete) \
          + pref_bonus * lpSum(D[i][j] * preferences[i][j] for i in range(P) for j in range(N)) \
          - staff_penalty * lpSum(SP1)
# Solve
status = prob.solve()

#print D.values()
for i in range(P):
    for j in range(N):
        print(D[i][j].value(), end=" ")
    print()

#print Z.values()
print("Zeta values")
for j in range(N):
    print(Z[j].value(), end=" ")
print()

print("deltas from 3 assigned people")
for j in range(N):
    print(SP1[j].value(), end=" ")

print()
print(f'total projects completed: {value(lpSum(proj_complete))} of {N}')

#Assert that sum(D[i][j]) over i is != 1 for each column j
for j in range(N):
    assert sum(D[i][j].value() for i in range(P)) != 1

Output:

Result - Optimal solution found

Objective value:                7.10000000
Enumerated nodes:               566
Total iterations:               9430
Time (CPU seconds):             1.73
Time (Wallclock seconds):       1.80

Option for printingOptions changed from normal to all
Total time (CPU seconds):       1.73   (Wallclock seconds):       1.81

1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 
Zeta values
1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 1.0 0.0 0.0 1.0 1.0 
deltas from 3 assigned people
1.0 3.0 3.0 3.0 3.0 3.0 3.0 0.0 1.0 1.0 3.0 3.0 1.0 1.0 
total projects completed: 6.0 of 14
[Finished in 1.9s]