Search code examples
pythonoptimizationdynamic-programmingknapsack-problem

Knapsack Problem: Find Top-K Lower Profit Solutions


In the classic 0-1 knapsack problem, I am using the following (dynamic programming) algorithm to construct a "dp table":

def knapsack(weights, values, capacity):
    n = len(weights)
    weights = np.concatenate(([0], weights))  # Prepend a zero
    values = np.concatenate(([0], values))    # Prepend a zero
    
    table = np.zeros((n+1, capacity+1), dtype=np.int64)  # The first row/column are zeros
    for i in range(n+1):
        for w in range(capacity+1):
            if i == 0 or w == 0:
                table[i, w] = 0
            elif weights[i] <= w:
                table[i, w] = max(
                    table[i-1, w-weights[i]] + values[i],
                    table[i-1, w]
                )
            else:
                table[i, w] = table[i-1, w]
    return table

Then, by traversing the table using the following code, I am then able identify the items that make up the optimal solution:

def get_items(weights, capacity, table):
    items = []
    i = len(weights)
    j = capacity
    weights = np.concatenate(([0], weights))  # Prepend a zero
    table_copy = table.copy()
    while i > 0 and j > 0:
        if table_copy[i, j] == table_copy[i-1, j]:
            pass  # Item is excluded
        else:
            items.append(i-1)  # Item is included, fix shifted index due to prepending zero
            j = j - weights[i]
        i = i-1
    return items

This is great for finding the items that make up the single optimal solution (i.e., highest total summed value). However, I can't seem to figure out how to retrieve, say, the top-3 or top-5 solutions from this table that has a total weight that is less than or equal to the maximum capacity.

For example, the top-3 solutions for the following input would be:

weights = [1,  2,  3, 2, 2]
values =  [6, 10, 12, 6, 5]
capacity = 5

# with corresponding "dp table"
array([[ 0,  0,  0,  0,  0,  0],
       [ 0,  6,  6,  6,  6,  6],
       [ 0,  6, 10, 16, 16, 16],
       [ 0,  6, 10, 16, 18, 22],
       [ 0,  6, 10, 16, 18, 22]])
Total Summed Value Items (Zero-based Index)
22 1, 2
22 0, 1, 3
21 0, 1, 4

Note that there is a tie for both first place and so we'd truncate the solutions after the first 3 rows (though, getting all ties is preferred if possible). Is there an efficient way to obtain the top-k solutions from the "dp table"?


Solution

  • I'd build a richer DP table. For each cell, you store just the total value of just one item set. I'd instead store information of the top k item sets for that cell: both the total value and the indices. Output of the below program:

    22 [0, 1, 3]
    22 [1, 2]
    21 [0, 1, 4]
    

    And that's the information I store in the final table cell table[-1][-1] instead of just your 22. More precisely, this is the Python data I store there:

    [(22, [1, 2]), (22, [0, 1, 3]), (21, [0, 1, 4])]
    

    If you need it more efficient, I think you could store only the last index of each item set instead of all its indices. But then you have to reconstruct the item sets later, and I'm not in the mood to do that :-)

    Full code:

    import numpy as np
    from operator import itemgetter
    
    def knapsack(weights, values, capacity, k):
        n = len(weights)
        weights = np.concatenate(([0], weights))  # Prepend a zero
        values = np.concatenate(([0], values))    # Prepend a zero
        get_total = itemgetter(0)
    
        table = np.zeros((n+1, capacity+1), dtype=object)  # The first row/column are zeros
        for i in range(n+1):
            for w in range(capacity+1):
                if i == 0 or w == 0:
                    table[i, w] = [(0, [])]
                elif weights[i] <= w:
                    table[i, w] = sorted(
                        [
                            (total + values[i], indices + [i-1])
                            for total, indices in table[i-1, w-weights[i]]
                        ] +
                        table[i-1, w],
                        key=get_total,
                        reverse=True
                    )[:k]
                else:
                    table[i, w] = table[i-1, w]
        return table[-1][-1]
    
    weights = [1,  2,  3, 2, 2]
    values =  [6, 10, 12, 6, 5]
    capacity = 5
    k = 3
    
    top_k = knapsack(weights, values, capacity, k)
    
    for total, indices in top_k:
        print(total, indices)
    

    Attempt This Online!