python scheduling linear-programming pulp

Ensuring that all positive values in a binary vector are next to each other during PulP Optimization

I have a scheduling optimisation problem where I need to optimise Activities to be performed in a given timeframe based on priority. In the problem below, there is enough time to perform all activities, so all should be performed.

The issue is that the optimisation is breaking up the activities instead of suggesting schedules where the whole duration of the activity occurs in one go.

import pulp

# Define activities with durations, resource requirements, priorities, and parallel constraints
# Resources are per timeframe (ie, if every hour you need 1 machine, the value is 1 not 1*hours)
# priority is total
activities = {
    "Activity A": {
        "duration": 2,
        "resources": {"resource1": 1, "resource2": 1},
        "priority": 3,
    },
    "Activity B": {
        "duration": 3,
        "resources": {"resource1": 2, "resource2": 1},
        "priority": 2,
    },
    "Activity C": {
        "duration": 1,
        "resources": {"resource1": 1, "resource2": 2},
        "priority": 1,
    },
    "Activity D": {
        "duration": 2,
        "resources": {"resource1": 1, "resource2": 1},
        "priority": 4,
    },
}

for name, value in activities.items():
    activities[name]["priority"] = (
        activities[name]["priority"] / activities[name]["duration"]
    )

# Define the time horizon
time_horizon = 10

# Create a LP problem
problem = pulp.LpProblem("ScheduleOptimization", pulp.LpMaximize)

# Create binary variables for each activity and time slot
activity_vars = {}
for activity in activities:
    for t in range(time_horizon):
        activity_vars[(activity, t)] = pulp.LpVariable(
            f"{activity}_{t}", 0, 1, pulp.LpInteger
        )

# Create a variable to represent the total priority
total_priority = pulp.LpVariable("TotalPriority", cat=pulp.LpContinuous)

# Objective: Maximize the total priority
problem += total_priority

# Constraints
## Activities can run a maximum of their duration
for activity_name, activity in activities.items():
    problem += (
        pulp.lpSum(activity_vars[(activity_name, tt)] for tt in range(0, time_horizon))
        <= activity["duration"]
    )

# Update total_priority variable to reflect the actual total priority
problem += total_priority == pulp.lpSum(
    activity["priority"] * activity_vars[(activity_name, t)]
    for activity_name, activity in activities.items()
    for t in range(time_horizon)
)

# Solve the problem
problem.solve(pulp.PULP_CBC_CMD(msg=1))

# Print the schedule
schedule = {}
for activity_name in activities:
    for t in range(time_horizon):
        print(
            activity_vars[(activity_name, t)],
            "___",
            pulp.value(activity_vars[(activity_name, t)]),
        )
        if (
            pulp.value(activity_vars[(activity_name, t)]) == 1
            and schedule.get(activity_name) is None
        ):
            schedule[activity_name] = t

print("Optimal Schedule:")
for activity, start_time in schedule.items():
    print(f"{activity} starts at time {start_time}")
print(f"Total Priority: {pulp.value(total_priority)}")

I expect all activities to behave like activity B, which starts and finish in its duration (ie, all 1s in the vector [Activity_B_0 to Activity_B_9] are together -> [0,0,0,0,0,1,1,1,0,0]

Activity A on the other hand is broken down despite breaking no constraints if it happened all in one go. [1,0,1, ...] should be [1,1,0, ...]

Below is an example solution where the variable Activity_A_0 refers to the Activity A being active at time 0. If the duration of the activity is 2 then the sum of Activit_A_x should == 2.

Activity_A_0 ___ 1.0 #Not continuous
Activity_A_1 ___ 0.0 #Not continuous
Activity_A_2 ___ 1.0 #Not continuous
Activity_A_3 ___ 0.0
Activity_A_4 ___ 0.0
Activity_A_5 ___ 0.0
Activity_A_6 ___ 0.0
Activity_A_7 ___ 0.0
Activity_A_8 ___ 0.0
Activity_A_9 ___ 0.0
Activity_B_0 ___ 0.0
Activity_B_1 ___ 0.0
Activity_B_2 ___ 0.0
Activity_B_3 ___ 0.0
Activity_B_4 ___ 0.0
Activity_B_5 ___ 1.0 #Continuous
Activity_B_6 ___ 1.0 #Continuous
Activity_B_7 ___ 1.0 #Continuous
Activity_B_8 ___ 0.0
Activity_B_9 ___ 0.0
Activity_C_0 ___ 0.0
Activity_C_1 ___ 0.0
Activity_C_2 ___ 0.0
Activity_C_3 ___ 0.0
Activity_C_4 ___ 0.0
Activity_C_5 ___ 0.0
Activity_C_6 ___ 0.0
Activity_C_7 ___ 0.0
Activity_C_8 ___ 1.0
Activity_C_9 ___ 0.0
Activity_D_0 ___ 0.0
Activity_D_1 ___ 1.0 # Not Continuous
Activity_D_2 ___ 0.0 # Not Continuous
Activity_D_3 ___ 1.0 # Not Continuous
Activity_D_4 ___ 0.0
Activity_D_5 ___ 0.0
Activity_D_6 ___ 0.0
Activity_D_7 ___ 0.0
Activity_D_8 ___ 0.0
Activity_D_9 ___ 0.0

Solution

A standard and often used approach to have all ones together is to limit the number of occurrences of the pattern 0 1 (often called start-ups in machine scheduling) in the binary variables x[t]. This can be modeled with

 s[t] >= x[t] - x[t-1] 
 sum(s[t]) <= 1
 x[t],s[t] binary

The objective function does not impact the formulation. This approach works with any objective function.
We can relax s[t] to be continuous between 0 and 1.
The very first x[t-1] is exogenous (i.e. data).