Tags: python, algorithm, sum, combinations, subset-sum

Find all combinations that add up to a given number in Python, with a list of lists


I've seen plenty of threads on how to find all combinations that add up to a number within a single list, but I wanted to know how to extend this so that you pick exactly one number from each list in a list of lists.

Question:
You must select one number from each list; how do you find all combinations that sum to N?

Given:
3 lists of differing but fixed lengths (e.g. l1 will always have 6 values, l2 will always have 10 values, etc.):

l1 = [0.013,0.014,0.015,0.016,0.017,0.018]
l2 = [0.0396,0.0408,0.042,0.0432,0.0444,0.045,0.0468,0.048,0.0492,0.0504]
l3 = [0.0396,0.0408]

Desired Output:
If N = 0.0954, then the output is [0.015, 0.0396, 0.0408], [0.015, 0.0408, 0.0396].

What I have tried:

from itertools import product
output = [sum(combo) for combo in product(l1, l2, l3, l4, l5, l6, l7, l8)]

However, this is too intensive, as my largest bucket has 34 values, which creates too many combinations.
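
For reference, a complete version of that brute-force idea over the three example lists would look like the sketch below (the 1e-9 tolerance is my own addition to absorb floating-point rounding):

from itertools import product

N = 0.0954
# keep only the combinations whose sum matches N, one element from each list
matches = [combo for combo in product(l1, l2, l3)
           if abs(sum(combo) - N) < 1e-9]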

Any help/tips on how to approach this in a more efficient manner would be greatly appreciated!


Solution

  • Non-recursive solution:

    from itertools import accumulate, product
    from sys import float_info
    
    def test(lists, target):
        # will return a list of 2-tuples, containing sum and elements making it
        convolutions = [(0,())]
        # lower_bounds[i] - what is the least gain we'll get from remaining lists
        lower_bounds = list(accumulate(map(min, lists[::-1])))[::-1][1:] + [0]
        # upper_bounds[i] - what is the max gain we'll get from remaining lists
        upper_bounds = list(accumulate(map(max, lists[::-1])))[::-1][1:] + [0]
        for l, lower_bound, upper_bound in zip(lists, lower_bounds, upper_bounds):
            convolutions = [
                # update sum and extend the list for viable candidates
                (accumulated + new_element, elements + (new_element,))
                for (accumulated, elements), new_element in product(convolutions, l)
            if lower_bound - float_info.epsilon <= target - accumulated - new_element <= upper_bound + float_info.epsilon
            ]
    
        return convolutions
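
    For example, bundling the question's lists and picking the target (the names lists and target below are my own, chosen to match the call shown next):

    lists = [l1, l2, l3]
    target = 0.0954
    print(test(lists, target))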
    

    Output of test(lists, target):

    [(0.09540000000000001, (0.015, 0.0396, 0.0408)),
     (0.09540000000000001, (0.015, 0.0408, 0.0396))]
    

    This can be further optimized by sorting the lists and slicing each one based on the upper/lower bounds using bisect (a fuller sketch follows this snippet):

    from bisect import bisect_left, bisect_right
    # ...
    
    convolutions = [
        (partial_sum + new_element, partial_elements + (new_element,))
        for partial_sum, partial_elements in convolutions
        # slice l down to the viable window; requires l to be sorted ascending
        for new_element in l[
            bisect_left(l, target - upper_bound - partial_sum - float_info.epsilon):
            bisect_right(l, target - lower_bound - partial_sum + float_info.epsilon)
        ]
    ]
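
    Putting the bounds pre-computation and the bisect slicing together, a complete sketch might look like this (the function name test_sorted and the explicit sorting of each inner list are my additions, not part of the original answer; the rest is the same logic as above):

    from bisect import bisect_left, bisect_right
    from itertools import accumulate
    from sys import float_info

    def test_sorted(lists, target):
        # sort every inner list so the bisect-based slicing is valid
        lists = [sorted(l) for l in lists]
        convolutions = [(0, ())]
        # min/max sums still obtainable from the lists not yet processed
        lower_bounds = list(accumulate(map(min, lists[::-1])))[::-1][1:] + [0]
        upper_bounds = list(accumulate(map(max, lists[::-1])))[::-1][1:] + [0]
        for l, lower_bound, upper_bound in zip(lists, lower_bounds, upper_bounds):
            convolutions = [
                (partial_sum + new_element, partial_elements + (new_element,))
                for partial_sum, partial_elements in convolutions
                # keep only elements that can still reach the target
                for new_element in l[
                    bisect_left(l, target - upper_bound - partial_sum - float_info.epsilon):
                    bisect_right(l, target - lower_bound - partial_sum + float_info.epsilon)
                ]
            ]
        return convolutions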