Search code examples
pythoncombinationspython-itertools

Filtered product of lists without repetitions


I need an efficient solution to find all combinations without repetitions of a given lists. It has to work like this:

l1 = [1, 2, 3]
l2 = [3, 4, 5]
combinations(l1, l2) = [(2, 4), (3, 4), (1, 5), (1, 4), (2, 3), (2, 5), (1, 3), (3, 5)]

The important thing is that I could use it for an arbitrary number of lists and in a very efficient way (preferably using itertools).

My current implementation is way too slow:

import itertools

def combinations(*all_lists: tuple[list[int]]):
    lists_product = itertools.product(*all_lists)
    res = set()

    for x in lists_product:
        if len(set(x)) == len(all_lists):
            x = tuple(sorted(x))
            res.add(x)

    return list(res)

I would appreciate any help.


Solution

  • Here is a solution which is a little faster :

    def combinations_2(*all_lists: tuple[list[int]]):
        len_all_list = len(all_lists)
        lists_product = list(itertools.product(*all_lists))
        fset = set(map(frozenset,lists_product))
        list_filter = list(map(sorted, list(filter(lambda x: len(x) == len_all_list, fset))))
        res_2 = list(map(tuple, list_filter))
        return res_2
    

    As your result res is from a set() which is an unordered data structure. So I sorted the two outputs for comparing.

    import random
    
    values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 15, 20, 25, 30]
    l1 = random.choices(values,k=100)
    l2 = random.choices(values,k=100)
    
    res = combinations(l1, l2)
    res_2 = combinations_2(l1, l2)
    
    res_temp = sorted(res, key=lambda x: (x[0], x[1]))
    res_2_temp = sorted(res_2, key=lambda x: (x[0], x[1]))
    
    print(res_temp == res_2_temp)
    

    Which gives the result

    True

    Then the running time :

    %timeit combinations(l1, l2)
    

    7.1 ms ± 437 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

    %timeit combinations_2(l1, l2)
    

    2.56 ms ± 109 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)