I want to efficiently generate pairs of elements from two lists equal to their Cartesian product with some elements omitted. The elements in each list are unique.
The code below does exactly what's needed but I'm looking to optimize it perhaps by replacing the loop.
See the comments in the code for details.
Any advice would be appreciated.
from itertools import product
from pprint import pprint as pp
def pairs(list1, list2):
""" Return all combinations (x,y) from list1 and list2 except:
1. Omit combinations (x,y) where x==y """
tuples = filter(lambda t: t[0] != t[1], product(list1,list2))
""" 2. Include only one of the combinations (x,y) and (y,x) """
result = []
for t in tuples:
if not (t[1], t[0]) in result:
result.append(t)
return result
list1 = ['A', 'B', 'C']
list2 = ['A', 'D', 'E']
pp(pairs(list1, list1)) # Test a list with itself
pp(pairs(list1, list2)) # Test two lists with some common elements
Output
[('A', 'B'), ('A', 'C'), ('B', 'C')]
[('A', 'D'),
('A', 'E'),
('B', 'A'),
('B', 'D'),
('B', 'E'),
('C', 'A'),
('C', 'D'),
('C', 'E')]
About 5-6 times faster than the fastest in your answer's benchmark. I build sets of values that appear in both lists or just one, and combine them appropriately without further filtering.
from itertools import product, combinations
def pairs(list1, list2):
a = {*list1}
b = {*list2}
ab = a & b
return [
*product(a, b-a),
*product(a-b, ab),
*combinations(ab, 2)
]
You could also make it an iterator (because unlike previous solutions, I don't need to store the already produced pairs to filter further ones):
from itertools import product, combinations, chain
def pairs(list1, list2):
a = {*list1}
b = {*list2}
ab = a & b
return chain(
product(a, b-a),
product(a-b, ab),
combinations(ab, 2)
)