Search code examples
pythonmathematical-optimizationor-toolsknapsack-problemcp-sat

Knapsack problem with a total item limit in Python


I'm trying to solve a knapsack problem with Python with extra requirements. I've found lots of knapsack code and math explanations but I can't find anything that quite fits what I'm trying to do. To be honest, the math information is way over my head, so that's why I'm asking here. I would be happy to learn to use any available library, as long as its free. Numpy, pandas, ortools, etc.

Let's say I have $20 and I want to buy 6 individual beers for a mix-and-match six pack container. Each beer has a name, a type, a price (knapsack-like "weight"), and a review score (knapsack-like "value").
I want to buy the highest combination of review scores while also following these rules:

  • Spend less than $20 total
  • Buy 6 beers exactly
  • I also want to be sure I always get 2 Lager, 1 Stout, and 1 Amber. The other 2 beers can be anything but all 6 beers have to be unique.

This is just a small list of data as an example. I know brute force might work here but my real data is much larger and my first attempts (before discovering this is a common problem) with brute force never finished, and would take way too long to find a solution. So I'm looking for something better than brute force.

capacity = 6
money = 20

beers = [ 
  {"name":"Beer1",  "type":"Lager",   "price":3.50, "score":4.1},
  {"name":"Beer2",  "type":"Porter",  "price":4.90, "score":4.5},
  {"name":"Beer3",  "type":"IPA",     "price":3.70, "score":4.0},
  {"name":"Beer4",  "type":"Stout",   "price":3.20, "score":4.2},
  {"name":"Beer5",  "type":"Amber",   "price":3.80, "score":3.9},
  {"name":"Beer6",  "type":"Stout",   "price":2.70, "score":2.9},
  {"name":"Beer7",  "type":"IPA",     "price":2.50, "score":3.2},
  {"name":"Beer8",  "type":"Pilsner", "price":3.10, "score":4.0},
  {"name":"Beer9",  "type":"Amber",   "price":3.00, "score":4.1},
  {"name":"Beer10", "type":"Porter",  "price":2.80, "score":3.3},
  {"name":"Beer11", "type":"IPA",     "price":3.70, "score":4.0},
  {"name":"Beer12", "type":"Lager",   "price":3.20, "score":4.2},
  {"name":"Beer13", "type":"Amber",   "price":3.30, "score":3.5},
  {"name":"Beer14", "type":"Stout",   "price":2.90, "score":2.8},
  {"name":"Beer15", "type":"Lager",   "price":3.20, "score":4.2},
]

The desired output would be a list of the names that were selected for the solution, like this: ["Beer12","Beer14","Beer13","Beer1","Beer7","Beer6"]

Thanks for looking. I hope we can find a solution, for me and for anyone else trying to solve a similar problem in the future.


Solution

  • It could look like the following example.

    We:

    • use cp-sat as solver
    • scale prices and scores to be able to use integral domains
      • cp-sat needs this

    Might contain a bug as i didn't check it much, but it's more about the general concepts anyway. It also indicates the modelling-power of the solver.

    Code

    from ortools.sat.python import cp_model
    
    # DATA
    capacity = 6
    money = 20
    beers = [ 
      {"name":"Beer1",  "type":"Lager",   "price":3.50, "score":4.1},
      {"name":"Beer2",  "type":"Porter",  "price":4.90, "score":4.5},
      {"name":"Beer3",  "type":"IPA",     "price":3.70, "score":4.0},
      {"name":"Beer4",  "type":"Stout",   "price":3.20, "score":4.2},
      {"name":"Beer5",  "type":"Amber",   "price":3.80, "score":3.9},
      {"name":"Beer6",  "type":"Stout",   "price":2.70, "score":2.9},
      {"name":"Beer7",  "type":"IPA",     "price":2.50, "score":3.2},
      {"name":"Beer8",  "type":"Pilsner", "price":3.10, "score":4.0},
      {"name":"Beer9",  "type":"Amber",   "price":3.00, "score":4.1},
      {"name":"Beer10", "type":"Porter",  "price":2.80, "score":3.3},
      {"name":"Beer11", "type":"IPA",     "price":3.70, "score":4.0},
      {"name":"Beer12", "type":"Lager",   "price":3.20, "score":4.2},
      {"name":"Beer13", "type":"Amber",   "price":3.30, "score":3.5},
      {"name":"Beer14", "type":"Stout",   "price":2.90, "score":2.8},
      {"name":"Beer15", "type":"Lager",   "price":3.20, "score":4.2},]
    
    # PREPROCESSING
    n_beers = len(set([entry['name'] for entry in beers]))
    
    # MODEL
    model = cp_model.CpModel()
    x_select = [model.NewBoolVar('') for i in range(n_beers)]
    
    # select exactly "capacity"
    model.Add(sum(x_select) == capacity)
    
    # spend not too much -> # ASSUMPTION: * 100 makes all the values integral
    model.Add(sum([x_select[i] * int(round(beers[i]['price']*100)) for i in range(n_beers)]) <= money * 100)
    
    # >= 2 lagers needed
    model.Add(sum([x_select[i] for i in range(n_beers) if beers[i]['type'] == 'Lager']) >= 2)
    
    # >= 1 Stout needed
    model.Add(sum([x_select[i] for i in range(n_beers) if beers[i]['type'] == 'Stout']) >= 1)
    
    # >= 1 Amber needed
    model.Add(sum([x_select[i] for i in range(n_beers) if beers[i]['type'] == 'Amber']) >= 1)
    
    # maximize sum of scores selected -> # ASSUMPTION: * 10 makes all the values integral
    model.Maximize(sum([x_select[i] * int(round(beers[i]['score']*10)) for i in range(n_beers)]))
    
    # SOLVE
    solver = cp_model.CpSolver()
    solver.parameters.log_search_progress = True
    model.Proto().objective.scaling_factor = -1./10         # inverse scaling for solver logging output
    status = solver.Solve(model)
    
    if status == cp_model.OPTIMAL:
      selected = [i for i in range(n_beers) if solver.Value(x_select[i]) == 1]
      print("\n".join([str(beers[i]) for i in selected]))
    

    Trimmed Output

    status: OPTIMAL
    objective: 24.8
    best_bound: 24.8
    booleans: 8
    conflicts: 0
    branches: 19
    propagations: 20
    integer_propagations: 42
    restarts: 17
    lp_iterations: 4
    walltime: 0.0689822
    usertime: 0.0689823
    deterministic_time: 2.03413e-05
    primal_integral: 1.69201e-05
    Total cuts added: 3 (out of 4 calls) worker: ''
      - num simplifications: 0
      - added 1 cut of type 'CG'.
      - added 1 cut of type 'MIR_1'.
      - added 1 cut of type 'MIR_2'.
    {'name': 'Beer1', 'type': 'Lager', 'price': 3.5, 'score': 4.1}
    {'name': 'Beer4', 'type': 'Stout', 'price': 3.2, 'score': 4.2}
    {'name': 'Beer8', 'type': 'Pilsner', 'price': 3.1, 'score': 4.0}
    {'name': 'Beer9', 'type': 'Amber', 'price': 3.0, 'score': 4.1}
    {'name': 'Beer12', 'type': 'Lager', 'price': 3.2, 'score': 4.2}
    {'name': 'Beer15', 'type': 'Lager', 'price': 3.2, 'score': 4.2}