Let's say I have four elements like this
test = ['A', 'B', 'C', 'D']
Then I can easily get all combinations consisting of three elements like this
import itertools as it
list(it.combinations(test, 3))
which gives
[('A', 'B', 'C'), ('A', 'B', 'D'), ('A', 'C', 'D'), ('B', 'C', 'D')]
However, now I have also several suffixes, e.g.
suff = ['x', 'y']
which I want to take into consideration. My desired outcome looks like this
[('A__x', 'B__x', 'C__x'),
('A__x', 'B__x', 'C__y'),
('A__x', 'B__x', 'D__x'),
('A__x', 'B__x', 'D__y'),
('A__x', 'B__y', 'C__x'),
('A__x', 'B__y', 'C__y'),
('A__x', 'B__y', 'D__x'),
('A__x', 'B__y', 'D__y'),
('A__x', 'C__x', 'D__x'),
('A__x', 'C__x', 'D__y'),
('A__x', 'C__y', 'D__x'),
('A__x', 'C__y', 'D__y'),
('A__y', 'B__x', 'C__x'),
('A__y', 'B__x', 'C__y'),
('A__y', 'B__x', 'D__x'),
('A__y', 'B__x', 'D__y'),
('A__y', 'B__y', 'C__x'),
('A__y', 'B__y', 'C__y'),
('A__y', 'B__y', 'D__x'),
('A__y', 'B__y', 'D__y'),
('A__y', 'C__x', 'D__x'),
('A__y', 'C__x', 'D__y'),
('A__y', 'C__y', 'D__x'),
('A__y', 'C__y', 'D__y'),
('B__x', 'C__x', 'D__x'),
('B__x', 'C__x', 'D__y'),
('B__x', 'C__y', 'D__x'),
('B__x', 'C__y', 'D__y'),
('B__y', 'C__x', 'D__x'),
('B__y', 'C__x', 'D__y'),
('B__y', 'C__y', 'D__x'),
('B__y', 'C__y', 'D__y')]
So I want to find all combinations of three for both suffixes but without a duplicate in everything in front of this suffix (e.g. I don't want ('A__x', 'A__y', 'C__x')
as there would be two A
).
I currently implement this as follows:
import itertools as it
def filter_elements(some_elements, connector):
# get all elements in front of the connector
all_elements = [eli.split(connector)[0] for eli in some_elements]
# return True if all elements are unique
return len(all_elements) == len(set(all_elements))
test = ['A', 'B', 'C', 'D']
suff = ['x', 'y']
connector = '__'
test_suff = ["{}{}{}".format(eli, connector, si) for eli in test for si in suff]
all_combinations = list(it.combinations(test_suff, 3))
desired_combinations = [combi for combi in all_combinations if filter_elements(combi, connector)]
which gives me the desired output.
Clearly, this should not only work for four elements and 2 suffixes but for arbitrary combinations. Is there a more straightforward way of achieving this which avoids the creation of all_combinations
?
I think that you only need it.product
:
>>> list(it.product('xy', repeat=3))
[('x', 'x', 'x'),
('x', 'x', 'y'),
('x', 'y', 'x'),
('x', 'y', 'y'),
('y', 'x', 'x'),
('y', 'x', 'y'),
('y', 'y', 'x'),
('y', 'y', 'y')]
So if you want the 32 tuples listed above:
list(it.product(it.combinations(test, 3), it.product(suff,repeat=3)))
It returns:
[(('A', 'B', 'C'), ('x', 'x', 'x')),
(('A', 'B', 'C'), ('x', 'x', 'y')),
(('A', 'B', 'C'), ('x', 'y', 'x')),
(('A', 'B', 'C'), ('x', 'y', 'y')),
(('A', 'B', 'C'), ('y', 'x', 'x')),
(('A', 'B', 'C'), ('y', 'x', 'y')),
(('A', 'B', 'C'), ('y', 'y', 'x')),
(('A', 'B', 'C'), ('y', 'y', 'y')),
(('A', 'B', 'D'), ('x', 'x', 'x')),
(('A', 'B', 'D'), ('x', 'x', 'y')),
(('A', 'B', 'D'), ('x', 'y', 'x')),
(('A', 'B', 'D'), ('x', 'y', 'y')),
(('A', 'B', 'D'), ('y', 'x', 'x')),
(('A', 'B', 'D'), ('y', 'x', 'y')),
(('A', 'B', 'D'), ('y', 'y', 'x')),
(('A', 'B', 'D'), ('y', 'y', 'y')),
(('A', 'C', 'D'), ('x', 'x', 'x')),
(('A', 'C', 'D'), ('x', 'x', 'y')),
(('A', 'C', 'D'), ('x', 'y', 'x')),
(('A', 'C', 'D'), ('x', 'y', 'y')),
(('A', 'C', 'D'), ('y', 'x', 'x')),
(('A', 'C', 'D'), ('y', 'x', 'y')),
(('A', 'C', 'D'), ('y', 'y', 'x')),
(('A', 'C', 'D'), ('y', 'y', 'y')),
(('B', 'C', 'D'), ('x', 'x', 'x')),
(('B', 'C', 'D'), ('x', 'x', 'y')),
(('B', 'C', 'D'), ('x', 'y', 'x')),
(('B', 'C', 'D'), ('x', 'y', 'y')),
(('B', 'C', 'D'), ('y', 'x', 'x')),
(('B', 'C', 'D'), ('y', 'x', 'y')),
(('B', 'C', 'D'), ('y', 'y', 'x')),
(('B', 'C', 'D'), ('y', 'y', 'y'))]
In order to get the desired format, you need to use zip
and join
:
[tuple('__'.join(ls) for ls in zip(*t)) for t in it.product(it.combinations(test, 3), it.product(suff,repeat=3))]
It returns:
[('A__x', 'B__x', 'C__x'),
('A__x', 'B__x', 'C__y'),
('A__x', 'B__y', 'C__x'),
('A__x', 'B__y', 'C__y'),
('A__y', 'B__x', 'C__x'),
('A__y', 'B__x', 'C__y'),
('A__y', 'B__y', 'C__x'),
('A__y', 'B__y', 'C__y'),
('A__x', 'B__x', 'D__x'),
('A__x', 'B__x', 'D__y'),
('A__x', 'B__y', 'D__x'),
('A__x', 'B__y', 'D__y'),
('A__y', 'B__x', 'D__x'),
('A__y', 'B__x', 'D__y'),
('A__y', 'B__y', 'D__x'),
('A__y', 'B__y', 'D__y'),
('A__x', 'C__x', 'D__x'),
('A__x', 'C__x', 'D__y'),
('A__x', 'C__y', 'D__x'),
('A__x', 'C__y', 'D__y'),
('A__y', 'C__x', 'D__x'),
('A__y', 'C__x', 'D__y'),
('A__y', 'C__y', 'D__x'),
('A__y', 'C__y', 'D__y'),
('B__x', 'C__x', 'D__x'),
('B__x', 'C__x', 'D__y'),
('B__x', 'C__y', 'D__x'),
('B__x', 'C__y', 'D__y'),
('B__y', 'C__x', 'D__x'),
('B__y', 'C__x', 'D__y'),
('B__y', 'C__y', 'D__x'),
('B__y', 'C__y', 'D__y')]