
"KeyError: (0, 'num')" when using List Comprehension to create PuLP LPConstraint


I thought what I was creating for optimization was a fairly simple problem. I have the following data in a csv file:

[Image: Simple MK Ride Information]

I'm trying to create an "optimal" day of rides using the PuLP package, but I keep getting errors when creating one of my constraints. Summarized simply, the constraint is: the total time spent on rides (# times ridden * ride time, plus # times ridden * wait time), plus 15 minutes between rides, has to be less than or equal to the time the park is open (minus a fixed amount of time for meals and restrooms).

I keep getting a KeyError: (0, 'num') when I run the line for the constraint. (I can print the ride_list.loc[0,'num'] without a problem - first step to troubleshoot.) Any tips or thoughts would be appreciated.

import pandas as pd
import pulp

ride_list = pd.read_csv('RideData.csv')
num_rides = ride_list.shape[0]
rides = list(range(num_rides))
min_times = 0 #minimum # times for each ride
max_times = 2 #maximum # times for each ride
time_between = 0.25 #how long to allow between rides
food_time = 2.5 #how long to allow for meals
hours_open = 12 #how many hours the park is open

# Set default values for testing
for i in rides:
    ride_list.loc[i, 'my_min'] = min_times
    ride_list.loc[i, 'my_max'] = max_times
    ride_list.loc[i, 'my_rating'] = ride_list.loc[i, 'rating']
    ride_list.loc[i, 'num'] = 0

# Initialize model
model = pulp.LpProblem('Maximimize ride enjoyment', pulp.LpMaximize)

# Add calculation to be optimized
model += pulp.lpSum([ride_list.loc[i,'ride'] * ride_list.loc[i,'rating'] * ride_list.loc[i,'num'] for i in rides])

# Add constraint to # times for each ride
x = pulp.LpVariable.dicts("times",[ride_list.loc[i, 'num'] for i in rides], 
                          lowBound = min_times,
                          upBound = max_times,
                          cat = 'Integer')

# Add constraint for total amount of time
total_rides = pulp.lpSum([ride_list[i,'num'] for i in rides])
model += pulp.LpConstraint('total_ride_time',
                          (pulp.lpSum([((ride_list.loc[i,'wait'] * ride_list.loc[i,'num']) + 
                          (ride_list.loc[i,'ride'] * ride_list.loc[i,'num'])) for i in rides]) + 
                          ((total_rides - 1) * time_between)) <= 
                          (hours_open - food_time))

Solution

  • The root of the problem is that you are not constructing the variable correctly in PuLP. When you do this:

    x = pulp.LpVariable.dicts("times",[ride_list.loc[i, 'num'] for i in rides], ...
    

    you are passing in a list of values taken from the data frame, which is incorrect. The call should provide the set of keys for which variables are created; here the logical index is simply the ride number, 0 through the number of rides minus 1. Because you filled the 'num' column with zeros, you are providing a list of zeros, which collapses to a single key, 0. See my example below. Also, the rest of your code never uses the variable you created. It uses ride_list.loc[i,'num'], which is just a fixed value in a data frame, not a variable in the problem.
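A side note on the traceback itself: the exact KeyError: (0, 'num') most likely comes from the total_rides line in the question, where ride_list[i,'num'] is missing .loc. The sketch below uses a small made-up data frame (the column values are assumptions, not the question's CSV) to show that lookup failing, and uses plain dictionaries as a stand-in to show how keys built from the column values collapse to one entry.

```python
import pandas as pd

# Hypothetical stand-in for the question's CSV data.
df = pd.DataFrame({"wait": [1, 5, 10], "num": [0, 0, 0]})

# .loc takes (row label, column label) and returns the cell value.
cell = df.loc[0, "num"]

# Plain [] on a DataFrame selects columns, so the tuple (0, 'num') is
# looked up as a single column label, which does not exist.
try:
    df[0, "num"]
    error_key = None
except KeyError as exc:
    error_key = exc.args[0]

# Keys built from the column values: every value is 0, so one key survives.
keys_from_values = {df.loc[i, "num"]: None for i in range(len(df))}

# Keys built from the row positions: one key per ride.
keys_from_index = {i: None for i in range(len(df))}

print(cell, error_key, list(keys_from_values), list(keys_from_index))
```

Adding .loc fixes the lookup, but the constraint would still be built from constants until it is rewritten in terms of the PuLP variables.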

    Here is an example that shows a better way to construct the variable. Note the output from printing x (poorly formed) and y. Also note how the variable is used in a notional constraint.

    import pulp
    import pandas as pd
    
    data = { 'wait':        [1, 5, 10],
             'ride_time':   [2, 6, 8],
             'num' :        [0, 0, 0]}
    
    df = pd.DataFrame(data)
    print(df)
    
    model = pulp.LpProblem("ride_plan", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("rides", [df.loc[i, 'num'] for i in range(len(df))])  # <-- bad
    print(x)
    
    y = pulp.LpVariable.dicts("rides", range(len(df)), cat="Integer")
    print(y)
    
    # example constraint: notional cap of 15 minutes on total wait time
    model += pulp.lpSum(df.loc[i, 'wait'] * y[i] for i in range(len(df))) <= 15
    
    print(model)
    

    Yields:

       wait  ride_time  num
    0     1          2    0
    1     5          6    0
    2    10          8    0
    
    {0: rides_0}    # <--- x
    {0: rides_0, 1: rides_1, 2: rides_2}   # <--- y
    
    ride_plan:
    MAXIMIZE
    None
    SUBJECT TO
    _C1: rides_0 + 5 rides_1 + 10 rides_2 <= 15
    
    VARIABLES
    rides_0 free Integer
    rides_1 free Integer
    rides_2 free Integer
    
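Applying the same idea to the model in the question could look roughly like the sketch below. The ride data is invented for illustration (the original CSV is not shown) and all times are assumed to be in hours; the bounds, time allowances, and objective mirror the question's code. Values read from the data frame are converted with float() so the arithmetic is handled by PuLP rather than NumPy.

```python
import pandas as pd
import pulp

# Hypothetical ride data in hours; the question's CSV is not shown.
ride_list = pd.DataFrame({
    "wait":   [0.50, 0.75, 1.00],
    "ride":   [0.10, 0.05, 0.20],
    "rating": [8, 6, 9],
})
rides = list(ride_list.index)

time_between = 0.25   # hours allowed between rides
food_time = 2.5       # hours reserved for meals
hours_open = 12       # hours the park is open

model = pulp.LpProblem("maximize_ride_enjoyment", pulp.LpMaximize)

# One integer decision variable per ride, keyed by row index.
x = pulp.LpVariable.dicts("times", rides, lowBound=0, upBound=2, cat="Integer")

# Objective as in the question: ride time * rating * number of times ridden.
model += pulp.lpSum(
    float(ride_list.loc[i, "ride"]) * float(ride_list.loc[i, "rating"]) * x[i]
    for i in rides
)

# Time budget: wait plus ride time per visit, plus a gap between rides.
total_rides = pulp.lpSum(x[i] for i in rides)
time_used = pulp.lpSum(
    (float(ride_list.loc[i, "wait"]) + float(ride_list.loc[i, "ride"])) * x[i]
    for i in rides
)
model += (
    time_used + (total_rides - 1) * time_between <= hours_open - food_time,
    "total_ride_time",
)

print(model)
```

Every term in the objective and the constraint now refers to x[i], so the solver has something to choose. Calling model.solve() afterwards requires a solver such as the CBC binary that normally ships with PuLP.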