Search code examples
pythonpython-3.xor-toolsconstraint-programmingcp-sat

Spreading out shift assignments in constraint solver (ortools)


I used the Google OR-Tools Employee Scheduling script (thanks by the way) to make a on-call scheduler. Everything works fine and it is doing what it is supposed to. It makes sure each person works about the same amount of "shifts" (two week periods), it lets certain shifts be requested and I added a constraint where it won't let someone work shifts back to back.

Everyone gets the same amount of shifts (as much as possible):

df['Name'].value_counts()
Out[42]: 
Jeff     7
Bubba    7
Sarah    6
Scott    6
Name: Name, dtype: int64

One thing I notice is that it will use up a person as much as it can before moving on to the next. E.g it will go 1-2-1-2-1-3...3-4-3-2-3-4. As opposed to 1-2-3-4-1-2-3-4...

print(df)
     Name        Date    Shift
0   Sarah  01-09-2022  On Call
1   Scott  01-23-2022  On Call
2   Sarah  02-06-2022  On Call
3   Scott  02-20-2022  On Call
4   Sarah  03-06-2022  On Call
5    Jeff  03-20-2022  On Call
6   Sarah  04-03-2022  On Call
7    Jeff  04-17-2022  On Call
8   Sarah  05-01-2022  On Call
9    Jeff  05-15-2022  On Call
10  Sarah  05-29-2022  On Call
11   Jeff  06-12-2022  On Call
12  Bubba  06-26-2022  On Call
13   Jeff  07-10-2022  On Call
14  Bubba  07-24-2022  On Call
15   Jeff  08-07-2022  On Call
16  Scott  08-21-2022  On Call
17  Bubba  09-04-2022  On Call
18   Jeff  09-18-2022  On Call
19  Bubba  10-02-2022  On Call
20  Scott  10-16-2022  On Call
21  Bubba  10-30-2022  On Call
22  Scott  11-13-2022  On Call
23  Bubba  11-27-2022  On Call
24  Scott  12-11-2022  On Call
25  Bubba  12-25-2022  On Call

(See how it kind of burns up one person--like Sarah at the start. I would expect Scott to be a lot like this, because of the -1 in request to not be on call during a large stretch--but a more even spread amongst everyone else would be ideal).

So I have two questions:

  1. Is there a way to make it distribute the people more evenly?
  2. Also, can I add another level of constraint in here by identifying certain time periods that contain holidays and then equally distribute those as well (kind of like a shift inside a shift)?

Here is my script:

# %% imports
import pandas as pd
from ortools.sat.python import cp_model

# %% Data for the model
num_employees = 4
num_shifts = 1
num_oncall_shifts = 26
all_employees = range(num_employees)
all_shifts = range(num_shifts)
all_oncall_shifts = range(num_oncall_shifts)

dict_shift_name = {0: 'On Call'}

dict_emp_name = {0: 'Bubba', 1: 'Scott', 2: 'Jeff', 3: 'Sarah'}

dict_dates = {
    0: '01-09-2022',
    1: '01-23-2022',
    2: '02-06-2022',
    3: '02-20-2022',
    4: '03-06-2022',
    5: '03-20-2022',
    6: '04-03-2022',
    7: '04-17-2022',
    8: '05-01-2022',
    9: '05-15-2022',
    10: '05-29-2022',
    11: '06-12-2022',
    12: '06-26-2022',
    13: '07-10-2022',
    14: '07-24-2022',
    15: '08-07-2022',
    16: '08-21-2022',
    17: '09-04-2022',
    18: '09-18-2022',
    19: '10-02-2022',
    20: '10-16-2022',
    21: '10-30-2022',
    22: '11-13-2022',
    23: '11-27-2022',
    24: '12-11-2022',
    25: '12-25-2022'
    }

shift_requests = [
    [
     #Employee 0 Bubba
     #1/09  1/23  2/06  2/20  3/06  3/20  4/03  4/17  5/01  5/15  5/29  6/12
     [0],   [0],  [0],  [0],  [0],  [0],  [0],  [0],  [0],  [0],  [0],  [0],
     #6/26  7/10  7/24  8/07  8/21  9/04  9/18 10/02 10/16 10/30 11/13 11/27
     [0],   [0],  [0],  [0],  [-4],  [0],  [0], [0],  [0],  [0],  [0],  [0],
    #12/11 12/25
     [0],  [0]     
     ],
    
    [
     #Employee 1 Scott
     #1/09  1/23  2/06  2/20  3/06  3/20  4/03  4/17  5/01  5/15  5/29  6/12
     [0],[0],[0],[0],[0],[0],[0],[0],[0],[0],[-1],[-1],
     #6/26  7/10  7/24  8/07  8/21  9/04  9/18 10/02 10/16 10/30 11/13 11/27
     [-1],[-1],[-1],[-1],[-1],[-1],[-1],[0],[0],[0],[0],[0],
    #12/11 12/25
     [0],[0]     
     ],
    
    [
     #Employee 2 Jeff
     #1/09  1/23  2/06  2/20  3/06  3/20  4/03  4/17  5/01  5/15  5/29  6/12
     [0],[0],[0],[0],[0],[0],[0],[0],[0],[0],[0],[0],
     #6/26  7/10  7/24  8/07  8/21  9/04  9/18 10/02 10/16 10/30 11/13 11/27
     [0],[0],[0],[0],[-2],[0],[0],[0],[0],[0],[0],[0],
    #12/11 12/25
     [0],[0]     
     ],
    
    [
     #Employee 3 Sarah
     #1/09  1/23  2/06  2/20  3/06  3/20  4/03  4/17  5/01  5/15  5/29  6/12
     [0],[0],[0],[0],[0],[0],[0],[0],[0],[0],[0],[0],
     #6/26  7/10  7/24  8/07  8/21  9/04  9/18 10/02 10/16 10/30 11/13 11/27
     [0],[0],[0],[0],[-3],[0],[0],[0],[0],[0],[0],[0],
    #12/11 12/25
     [0],[0]     
     ],
]

# dataframe
df = pd.DataFrame(columns=['Name', 'Date', 'Shift'])

# %% Create the Model
model = cp_model.CpModel()

# %% Create the variables

# Shift variables# Creates shift variables.
# shifts[(n, d, s)]: nurse 'n' works shift 's' on day 'd'.
shifts = {}
for n in all_employees:
    for d in all_oncall_shifts:
        for s in all_shifts:
            shifts[(n, d, s)] = model.NewBoolVar('shift_n%id%is%i' % (n, d, s))

# %% Add constraints
# Each shift is assigned to exactly one employee in .
for d in all_oncall_shifts :
    for s in all_shifts:
        model.AddExactlyOne(shifts[(n, d, s)] for n in all_employees)
        
# Each employee works at most one shift per oncall_shifts.
for n in all_employees:
    for d in all_oncall_shifts:
        model.AddAtMostOne(shifts[(n, d, s)] for s in all_shifts)

# Try to distribute the shifts evenly, so that each employee works
# min_shifts_per_employee shifts. If this is not possible, because the total
# number of shifts is not divisible by the number of employee, some employees will
# be assigned one more shift.
min_shifts_per_employee = (num_shifts * num_oncall_shifts) // num_employees
if num_shifts * num_oncall_shifts % num_employees == 0:
    max_shifts_per_employee = min_shifts_per_employee
else:
    max_shifts_per_employee = min_shifts_per_employee + 1
for n in all_employees:
    num_shifts_worked = 0
    for d in all_oncall_shifts:
        for s in all_shifts:
            num_shifts_worked += shifts[(n, d, s)]
    model.Add(min_shifts_per_employee <= num_shifts_worked)
    model.Add(num_shifts_worked <= max_shifts_per_employee)

# "penalize" working shift back to back
for d in all_employees:
    for b in all_oncall_shifts[:-1]:
        for r in all_shifts:
            for r1 in all_shifts:
                model.AddImplication(shifts[(d, b, r)], shifts[(d, b+1, r1)].Not())

# %% Objective
model.Maximize(
    sum(shift_requests[n][d][s] * shifts[(n, d, s)] for n in all_employees
        for d in all_oncall_shifts for s in all_shifts))

# %% Solve
# Creates the solver and solve.
solver = cp_model.CpSolver()
status = solver.Solve(model)
if status == cp_model.OPTIMAL:
    print('Solution:')
    for d in all_oncall_shifts:
        print('On Call Starts: ', dict_dates[d])
        for n in all_employees:
            for s in all_shifts:
                if solver.Value(shifts[(n, d, s)]) == 1:
                    if shift_requests[n][d][s] == 1:
                        print(dict_emp_name[n], ' is ', dict_shift_name[s], '(requested).')
                    else:
                        print(dict_emp_name[n], ' is ', dict_shift_name[s],
                              '(not requested).')
                        
                    list_append = [dict_emp_name[n], dict_dates[d], dict_shift_name[s]]
                    
                    df.loc[len(df)] = list_append
        print()
    print(f'Number of shift requests met = {solver.ObjectiveValue()}',
          f'(out of {num_employees * min_shifts_per_employee})')
else:
    print('No optimal solution found !')

# %% Stats
print('\nStatistics')
print('  - conflicts: %i' % solver.NumConflicts())
print('  - branches : %i' % solver.NumBranches())
print('  - wall time: %f s' % solver.WallTime())

I get that it is difficult to program "fairness" into something, so any help would be greatly appreciated.

*** edit--added examples ***


Solution

  • There seem to be frequent complaints about lack of documentation, but there is some available on the Google OR-Tools site at https://developers.google.com/optimization .

    I learned a lot from the user manual from the old Google OR-Tools ConstraintSolver, to be found at https://www.scribd.com/document/482135694/user-manual-A4-v-0-5-2-pdf or Google "or-tools user_manual_A4.v.0.5.2 Nikolaj van Omme Laurent Perron Vincent Furnon". Even if you're using the new solver, a lot of concepts from that document still apply.

    The Google team maintains a list of recommended literature at https://developers.google.com/optimization/support/resources

    To do your two-level optimization:

    Define a new variable bestFirstObjective and constrain it to be equal to the expression of your optimization value:

    # here I simply use 1000 as the horizon for whatever maximum value would be plausible
    firstObjective = model.NewIntVar(0, 1000, "Objective1")
    model.Add(firstObjective == sum(shift_requests[n][d][s] * shifts[(n, d, s)] for n in all_employees
            for d in all_oncall_shifts for s in all_shifts))
    model.Maximize(bestFirstObjective)
    

    Then after solving, retrieve the value for firstObjective, and save it in a new Python integer bestFirstObjective (not an IntVar).

    Now in a new model2, add all the original variables and constraints, but instead of calling model.Maximize(bestFirstObjective) do this:

    model2.Add(firstObjective == bestFirstObjective)
    

    In this new model, construct the variables and constraints you need for the subsidiary objective, and then add:

    model2.Maximize(secondObjective)
    

    and solve.

    As Laurent Perron mentioned in Multiple objective functions with binary variables Google OR-tools, you can add the optimized solution to the original model as a hint during the second solution, that should speed things up. See https://developers.google.com/optimization/reference/python/sat/python/cp_model#addhint.

    I'm not sure if the second model is really needed, it might suffice to reuse the first one, it depends on whether calling Maximize a second time replaces the original objective or confuses the model, I simply don't know which is the case.

    In your last comment it sounds like you need more help to construct a measure for "evenly distributed", something like trying to maximize the lowest interval between two consecutive shifts of any employee. The challenge here is to compute the length of this interval from the Booleans that describe whether a shift is worked or not.

    Let's assume that the number of days between successive shifts should be "evenly distributed". This would be the case if the shortest break between any two shifts for all the employees would be as large as possible.

    One approach would be to first construct a Boolean which is true if an employee worked on day d (pseudo-code here, I haven't actually tested anything):

    # For each employee e and day d:
    workedOnDay[e, d] = model2.NewBoolvar("Worked on day xxx")
    model2.AddMaxEquality(workedOnDay[e, d], {shifts[(e, d, s)]  for s in all_shifts})
    

    new IntVar offDaysCounter for each employee/day. If the employee worked on the previous day (d-1) constrain its value to be 0. If the employee did not work on the previous day, constrain it to be the value of the previous day for that employee plus 1. This could be done something like this:

    On the first day there is no "previous day" so constrain offDaysCounter to be an arbitrary large number without condition.

    # For each employee e 
    offDaysCounter[e, 0] = model.NewIntVar(num_oncall_shifts, num_oncall_shifts, "OffDaysCounter xxx")
    

    On the subsequent days:

    # For each employee e and day d (except the first day)
    offDaysCounter[e, d] = model.NewIntVar(0, 2 * num_oncall_shifts, "OffDaysCounter xxx")
    model2.Add(offDaysCounter[e, d] == 0).OnlyEnforceIf(workedOnDay[e, (d - 1)])
    model2.Add(offDaysCounter[e, d] == offDaysCounter[e, (d - 1)] + 1).OnlyEnforceIf(workedOnDay[e, (d - 1)].Not())
    

    The variable will count up on the off days and be reset to 0 on the worked days; its value just before the next worked day is the length of the break.

    Now we can compute the actual number of off days between shifts. Create a new IntVar offDays for each employee/day. If the employee did not work on a day, set its value to be equal to offDays for the following day. If the employee worked on a day, set its value to be equal to offDaysCounter for the same day. This variable will now have a value on each day equal to the break between the last and next worked day relative to this day.

    # For each employee e and day d (except the last day):
    offDays[e, d] = model.NewIntVar(0, 2 * num_oncall_shifts, "OffDays xxx")
    model.Add(offDays[e, d] == offDays[e, (d + 1)]).OnlyEnforceIf(workedOnDays[e, d].Not())
    model.Add(offDays[e, d] == offDaysCounter[e, d]).OnlyEnforceIf(workedOnDays[e, d])
    

    On the last day, there is no following day, so set the value of offDays to an arbitrary large number (like the total number of days), if the day was not worked, otherwise to the offDaysCounter[e, d]

    The result could look something like this for a given employee:

    day 0 1 2 3 4 5 6 7 8
    workedOnDay 1 0 0 1 0 1 0 0 0
    offDaysCounter 18 0 1 2 0 1 0 1 2
    offDays 18 2 2 2 1 1 18 18 18

    Now we can construct the objective: maximize the minimum interval between any employees shifts:

    secondObjective = model.NewIntVar(0, 2 * num_oncall_shifts, "second objective")
    model.AddMinEquality(secondObjective, {offDays[e, d] for d in all_oncall_shifts, for e in all_employees})
    model.Maximize(secondObjective)
    

    As I said, I haven't actually tested anything...