Search code examples
pythonlinear-programmingpyomopulp

Python define variable name and condition based on excel inputs


I am working on LP Problem in python.

Allotment choices:

Type          Qty_Available   Cost  For_Protein    For_Carb
Protein_Can         50         10      TRUE          FALSE
Soup                15          8      FALSE         TRUE
EnergyDrink         10         15      TRUE          TRUE
            

Dataframe

StudentID   Age ProteinDef  CarbDef
A01         16      TRUE     TRUE
A02         17      FALSE    TRUE
A03         16      FALSE    FALSE
A04         18      TRUE     TRUE
A05         16      FALSE   TRUE
B01         18      FALSE   FALSE
B02         18      TRUE    TRUE
B03         20      FALSE   TRUE
B04         19      FALSE   FALSE
B05         19      TRUE    FALSE

My code is working well. The issue I am facing is that allotment choice change every month based on availability and is provided in excel. But, my code is hard coded (shown below)

Protein_Can = pulp.LpVariable.dicts("Protein_Can", df.StudentID, pulp.LpBinary)
Soup= pulp.LpVariable.dictsSoupdf.StudentID, pulp.LpBinary)
EnergyDrink= pulp.LpVariable.dicts("EnergyDrink", df.StudentID, pulp.LpBinary)


cost[id] = Protein_Can[id] * Protein_Can_Cost + Soup[id] * Soup_Cost + EnergyDrink[id] * EnergyDrink_Cost 

for id in df.StudentID:
    if df.loc[id]['ProteinDef'] == TRUE:
        prob +=  Soup[id] <= 0
    if df.loc[id]['CarbDef'] == TRUE:
        prob +=  Protein_Can[id] <= 0

Is there approach possible in which I can make sure that variable is defined based on excel input automatically? Not Hard coded way? Can such code be made dynamic?


Solution

  • Yes!

    You need to make better use of sets. If you have a text on linear programming or look through some tutorials, you should be able to find many examples. You should not be hard coding the data into your formulation directly as you are. It is fine to either read the data from a file or develop it separately before your math model, but not in the formulation of constraints & such.

    Your example above has 2 main sets, each of those has two subsets. (below in pseudo-code)

    main sets

    Foods (or "allotment choices"):

    Foods = {protein_can, soup, etc...}
    

    You could use those string names as the set members or just use an index and keep track of the names in an indexed list separate from the problem

    Students:

    Students = {A01, A02, ... }
    

    Subsets

    Each of the sets above has 2 subsets, the association with protein and carbs. So for example for Foods:

    Foods_P = {Protein_can, energy_drink}  
    Foods_C = ...
    

    Those should be very easy to set up when you read in data, and using them will make your program cleaner and faster.

    Using the sets...

    After you have that, you will only need 2 variables to work with the foods:

    avail[food]
    cost[food]
    

    where food is a member of the set Foods

    Your model appears to be an assignment model where you are assigning some quantity of a particular food to a particular student, so you should have some decision variable that is double-indexed:

    x[f, s]  # an integer/real number representing the assignment of x quantity of food f to student s
    

    Similar constructs for the constraints and cost should fall into place...

    Reading in from Excel.

    You have a couple options here...

    1. Use pandas to read in the info, which is probably overkill and not necessary
    2. Use a direct Excel interface, like openpyxl, which again is overkill.
    3. After your Excel file is set up, save each tab as a separate .csv file (standard for data) and then let python read the csv's which is straightforward. (Use "save as" function in Excel and select .csv ... ignore Bill Gates' warnings)

    You should produce two csvs, one for food, one for students. Then they can be read as such:

    # script to import csv files
    
    file_name = 'food.csv'
    
    qty_dict = {}
    cost_dict = {}
    foods = set()
    food_p = set()
    food_c = set()
    
    with open(file_name, 'r') as fin:
        fin.readline()  # read the header and discard it
        # process the rest
        for line in fin:
            data = line.strip().split(',')
            key = data[0]           # the type name
            qty = int(data[1])      # int conversion of qty
            cost = int(data[2])     # int conversion of cost
            qty_dict[key] = qty
            cost_dict[key] = cost
    
            # test the booleans, construct the sets
            if data[3].lower() == 'true':
                food_p.add(key)
            if data[4].lower() == 'true':
                food_c.add(key)
    
    print (cost_dict)
    print (food_p)
    
    
    # do same for students...
    
    # make pulp model
    
    # you probably have a decision variable X[f,s] which is probably something like:
    
    X = pulp.LpVariable.dicts("assign", [(f, s) for f in foods for s in students], cat='Binary')
    

    I don't have pulp installed, so the last line is accurate, but not tested.

    You should be able to set up your constraints and the objective with the dictionaries and sets that were created during read-in.

    Snapshot of food in Excel:

    enter image description here