Search code examples
pythonperformancecombinationspython-itertools

How to get all combinations of elements ignoring a suffix?


Let's say I have four elements like this

test = ['A', 'B', 'C', 'D']

Then I can easily get all combinations consisting of three elements like this

import itertools as it
list(it.combinations(test, 3))

which gives

 [('A', 'B', 'C'), ('A', 'B', 'D'), ('A', 'C', 'D'), ('B', 'C', 'D')]

However, now I have also several suffixes, e.g.

suff = ['x', 'y']

which I want to take into consideration. My desired outcome looks like this

[('A__x', 'B__x', 'C__x'),
 ('A__x', 'B__x', 'C__y'),
 ('A__x', 'B__x', 'D__x'),
 ('A__x', 'B__x', 'D__y'),
 ('A__x', 'B__y', 'C__x'),
 ('A__x', 'B__y', 'C__y'),
 ('A__x', 'B__y', 'D__x'),
 ('A__x', 'B__y', 'D__y'),
 ('A__x', 'C__x', 'D__x'),
 ('A__x', 'C__x', 'D__y'),
 ('A__x', 'C__y', 'D__x'),
 ('A__x', 'C__y', 'D__y'),
 ('A__y', 'B__x', 'C__x'),
 ('A__y', 'B__x', 'C__y'),
 ('A__y', 'B__x', 'D__x'),
 ('A__y', 'B__x', 'D__y'),
 ('A__y', 'B__y', 'C__x'),
 ('A__y', 'B__y', 'C__y'),
 ('A__y', 'B__y', 'D__x'),
 ('A__y', 'B__y', 'D__y'),
 ('A__y', 'C__x', 'D__x'),
 ('A__y', 'C__x', 'D__y'),
 ('A__y', 'C__y', 'D__x'),
 ('A__y', 'C__y', 'D__y'),
 ('B__x', 'C__x', 'D__x'),
 ('B__x', 'C__x', 'D__y'),
 ('B__x', 'C__y', 'D__x'),
 ('B__x', 'C__y', 'D__y'),
 ('B__y', 'C__x', 'D__x'),
 ('B__y', 'C__x', 'D__y'),
 ('B__y', 'C__y', 'D__x'),
 ('B__y', 'C__y', 'D__y')]

So I want to find all combinations of three for both suffixes but without a duplicate in everything in front of this suffix (e.g. I don't want ('A__x', 'A__y', 'C__x') as there would be two A).

I currently implement this as follows:

import itertools as it

def filter_elements(some_elements, connector):

    # get all elements in front of the connector
    all_elements = [eli.split(connector)[0] for eli in some_elements]

    # return True if all elements are unique
    return len(all_elements) == len(set(all_elements))

test = ['A', 'B', 'C', 'D']

suff = ['x', 'y']

connector = '__'

test_suff = ["{}{}{}".format(eli, connector, si) for eli in test for si in suff]

all_combinations = list(it.combinations(test_suff, 3))

desired_combinations = [combi for combi in all_combinations if filter_elements(combi, connector)]

which gives me the desired output.

Clearly, this should not only work for four elements and 2 suffixes but for arbitrary combinations. Is there a more straightforward way of achieving this which avoids the creation of all_combinations?


Solution

  • I think that you only need it.product:

    >>> list(it.product('xy', repeat=3))
    [('x', 'x', 'x'),
     ('x', 'x', 'y'),
     ('x', 'y', 'x'),
     ('x', 'y', 'y'),
     ('y', 'x', 'x'),
     ('y', 'x', 'y'),
     ('y', 'y', 'x'),
     ('y', 'y', 'y')]
    

    So if you want the 32 tuples listed above:

    list(it.product(it.combinations(test, 3), it.product(suff,repeat=3)))
    

    It returns:

    [(('A', 'B', 'C'), ('x', 'x', 'x')),
     (('A', 'B', 'C'), ('x', 'x', 'y')),
     (('A', 'B', 'C'), ('x', 'y', 'x')),
     (('A', 'B', 'C'), ('x', 'y', 'y')),
     (('A', 'B', 'C'), ('y', 'x', 'x')),
     (('A', 'B', 'C'), ('y', 'x', 'y')),
     (('A', 'B', 'C'), ('y', 'y', 'x')),
     (('A', 'B', 'C'), ('y', 'y', 'y')),
     (('A', 'B', 'D'), ('x', 'x', 'x')),
     (('A', 'B', 'D'), ('x', 'x', 'y')),
     (('A', 'B', 'D'), ('x', 'y', 'x')),
     (('A', 'B', 'D'), ('x', 'y', 'y')),
     (('A', 'B', 'D'), ('y', 'x', 'x')),
     (('A', 'B', 'D'), ('y', 'x', 'y')),
     (('A', 'B', 'D'), ('y', 'y', 'x')),
     (('A', 'B', 'D'), ('y', 'y', 'y')),
     (('A', 'C', 'D'), ('x', 'x', 'x')),
     (('A', 'C', 'D'), ('x', 'x', 'y')),
     (('A', 'C', 'D'), ('x', 'y', 'x')),
     (('A', 'C', 'D'), ('x', 'y', 'y')),
     (('A', 'C', 'D'), ('y', 'x', 'x')),
     (('A', 'C', 'D'), ('y', 'x', 'y')),
     (('A', 'C', 'D'), ('y', 'y', 'x')),
     (('A', 'C', 'D'), ('y', 'y', 'y')),
     (('B', 'C', 'D'), ('x', 'x', 'x')),
     (('B', 'C', 'D'), ('x', 'x', 'y')),
     (('B', 'C', 'D'), ('x', 'y', 'x')),
     (('B', 'C', 'D'), ('x', 'y', 'y')),
     (('B', 'C', 'D'), ('y', 'x', 'x')),
     (('B', 'C', 'D'), ('y', 'x', 'y')),
     (('B', 'C', 'D'), ('y', 'y', 'x')),
     (('B', 'C', 'D'), ('y', 'y', 'y'))]
    

    In order to get the desired format, you need to use zip and join:

    [tuple('__'.join(ls) for ls in zip(*t)) for t in it.product(it.combinations(test, 3), it.product(suff,repeat=3))]
    

    It returns:

    [('A__x', 'B__x', 'C__x'),
     ('A__x', 'B__x', 'C__y'),
     ('A__x', 'B__y', 'C__x'),
     ('A__x', 'B__y', 'C__y'),
     ('A__y', 'B__x', 'C__x'),
     ('A__y', 'B__x', 'C__y'),
     ('A__y', 'B__y', 'C__x'),
     ('A__y', 'B__y', 'C__y'),
     ('A__x', 'B__x', 'D__x'),
     ('A__x', 'B__x', 'D__y'),
     ('A__x', 'B__y', 'D__x'),
     ('A__x', 'B__y', 'D__y'),
     ('A__y', 'B__x', 'D__x'),
     ('A__y', 'B__x', 'D__y'),
     ('A__y', 'B__y', 'D__x'),
     ('A__y', 'B__y', 'D__y'),
     ('A__x', 'C__x', 'D__x'),
     ('A__x', 'C__x', 'D__y'),
     ('A__x', 'C__y', 'D__x'),
     ('A__x', 'C__y', 'D__y'),
     ('A__y', 'C__x', 'D__x'),
     ('A__y', 'C__x', 'D__y'),
     ('A__y', 'C__y', 'D__x'),
     ('A__y', 'C__y', 'D__y'),
     ('B__x', 'C__x', 'D__x'),
     ('B__x', 'C__x', 'D__y'),
     ('B__x', 'C__y', 'D__x'),
     ('B__x', 'C__y', 'D__y'),
     ('B__y', 'C__x', 'D__x'),
     ('B__y', 'C__x', 'D__y'),
     ('B__y', 'C__y', 'D__x'),
     ('B__y', 'C__y', 'D__y')]