
Reducing run-time of Google OR-tools python script (using Google Cloud)


I'm developing a nurse scheduling program in Python for one of the departments of the hospital I work with. Various examples of such programs already exist and are shared online. One of these is the following: https://github.com/google/or-tools/blob/master/examples/python/shift_scheduling_sat.py

So far I have adapted the code in the link above to include various types of labour regulations as well as individual nurse preferences. Now I would like to use this tailored script to produce rosters for a workforce of 25 nurses over a 7 week period (5 shift types, can be reduced to 4).

However, implementing particular types of constraints leads to a significant increase in run-time. These constraints are:

  • Constraint on the length of series of morning/evening/night shifts:

     shift_constraints = [
         # (shift, hard_min, soft_min, min_cost, soft_max, hard_max, max_cost)
         # Morning shifts
         (1, 2, 2, 0, 4, 4, 0),
         # Evening shifts
         (2, 2, 2, 0, 4, 4, 0),
         # Night shifts
         (3, 1, 2, 2, 5, 5, 0),
     ]
    
  • Constraint on days off. I would like to prevent the scheduling of single days off by adding to the list of shift constraints:

    (0, 1, 2, 2, 10, 10, 0)
    
  • Force that weekends are scheduled off as a whole, i.e. if Saturday is off, Sunday must be off as well:

        for e in range(num_employees):
            for d in range(num_days - 1):
                # d and d + 1 form a Saturday/Sunday pair.
                if (d in weekend_day) and ((d + 1) in weekend_day):
                    # If employee e is off (shift 0) on Saturday,
                    # force Sunday off as well.
                    model.Add(work[e, 0, d + 1] == 1).OnlyEnforceIf(work[e, 0, d])
    
  • Force that employees have 2 days off after a series of 3 consecutive night shifts:

        for e in range(num_employees):
            for d in range(num_days):
                if 3 < d < 45:
                    # OnlyEnforceIf takes a list of literals (a conjunction);
                    # chaining BoolVars with Python's `and` keeps only the
                    # last variable and silently weakens the constraint.
                    nights = [work[e, 3, d - 3], work[e, 3, d - 2], work[e, 3, d - 1]]
                    model.Add(work[e, 0, d] == 1).OnlyEnforceIf(nights)
                    model.Add(work[e, 0, d + 1] == 1).OnlyEnforceIf(nights)
    
  • Force that employees cannot work more than 7 consecutive days:

    max_seq_length = 7
    for e in range(num_employees):
        works = [work[e, 0, d].Not() for d in range(num_days)]
        # add_soft_sequence_constraint(model, works, hard_min, soft_min,
        #     min_cost, soft_max, hard_max, max_cost, prefix)
        variables, coeffs = add_soft_sequence_constraint(
            model, works, 0, 0, 0, max_seq_length, max_seq_length, 0,
            'shift_constraint(employee %i, shift %i)' % (e, 0))
        obj_bool_vars.extend(variables)
        obj_bool_coeffs.extend(coeffs)
    

Running the script without any of these constraints takes less than a minute. However, adding all of them simultaneously can push the run-time beyond 48 hours. Is it possible to reduce the run-time? If it helps, I don't necessarily require an optimal solution: since I hardly make use of penalized constraints, any solution that satisfies the specified constraints will do.


Solution

  • I finally managed to solve the problem. Below I describe my approach using Google Cloud's Compute Engine.

    Before I landed on the solution, there are a couple of things I tried:

    • To start, I optimized the code in terms of run-time. For example, I removed a shift type that is assigned only a couple of times a month, to reduce the search space. I also removed employee-specific shift rotation rules and length-of-series constraints, since these cause major increases in run-time.
    • I tried various settings of the "num_search_workers" parameter. Be aware that in the latest version of the script this parameter is no longer included in the code, and has to be added manually.

    However, these changes didn't lead to the desired reduction in run-time. It was clear the code was fine; I simply didn't have enough computing power. Therefore I considered how to run the script on a more powerful device. I started by trying to run it in Google Colab, but there run-times were even longer than on my own device.

    Then I decided to try Compute Engine on Google Cloud Platform. GCP lets you set up a virtual machine dedicated to CPU-intensive tasks (with custom specifications) in a few clicks. Also, if you have never used the service before, you get 300 dollars of credit to rent servers. If you sign up, by default you are assigned a trial account that has some limitations on the specifications of the VMs. However, if you switch to a paid account, the credit is added to your account and the limitations are lifted. This way I had access to machines that were orders of magnitude more powerful than my laptop.

    For those interested in using GCP to run Python scripts, I would recommend watching this video as an introduction:

    Also, Google provides an introduction to using VMs in this codelab:

    These videos cover how to set up Python on Debian:

    If you are setting up Python, don't forget to create a virtual environment in which to install packages and run scripts:

    Once I figured out how to run scripts on a VM, I experimented with VM configurations and num_search_workers settings. Don't forget that on a multi-core processor you can increase the number of search workers, which lowers run-times.

    I noticed that even the 8-core (30 GB RAM) compute-dedicated machine (available on the trial account) was 20-30× faster than my laptop in terms of run-time. However, I decided to use an even more powerful VM: with the 30-core setup (130 GB RAM), the script was done running in 20 seconds instead of 48 hours.