Search code examples
pythoncombinationsncr

Python: Find specific and distinct set of combinations


I have 4 departments:

# department 1, 2, 3, 4
departments = ['d1','d2','d3','d4']

Each department has 8 units:

d1 = [1,2,3,4,5,6,7,8]
d2 = [1,2,3,4,5,6,7,8]
d3 = [1,2,3,4,5,6,7,8]
d4 = [1,2,3,4,5,6,7,8]

I need to create working groups.

A working group is comprised of 2 units from each department, where there is only one individual from any given numbered unit. 8 total units, 2 per department, must contain 1 - 8 with no duplicate unit numbers.

# Example Valid:
working_group_1 = ['d11', 'd12', 'd23', 'd24', 'd35', 'd36', 'd47', 'd48']
working_gorup_2 = ['d18', 'd17', 'd26', 'd25', 'd34', 'd33', 'd42', 'd41']
working_gorup_3 = ['d14', 'd18', 'd22', 'd26', 'd31', 'd35', 'd43', 'd47']
# each working group has 8 total individuals
# each working group has 2 individuals from each group
# each working group has 1 individual from each section 1-8

# Example Invalid:
working_group_4 = ['d12', 'd11', 'd23', 'd24', 'd35', 'd36', 'd47', 'd48']

# wg_4 is invalid because it is a copy of wg_1, where individuals from section 1, 2 are both assigned from the same department, just in a different order.
# This is not what I looking for, but might be close-ish
# The following snippet will give me all permutations
# e.g., ('d11', 'd12', 'd24', 'd27', 'd32', 'd34', 'd41', 'd46')
# The problem with this example is that, there are two individuals from unit 1
# - d11, d41 <- this is not allowed

import itertools

# Make the groups
groups = [[f'd{i+1}{j+1}' for j in range(8)] for i in range(4)]
lists = [list(itertools.combinations(groups[i],2)) for i in range(len(groups))]

#Generate list of all combinations, selecting 2 units from each department
all_combinations = [] 
for i in itertools.product(lists[0],lists[1],lists[2],lists[3]):
    all_combinations.append(i[0]+i[1]+i[2]+i[3])

#Output all combinations
print(all_combinations)

Any help is appreciated, thanks in advance.


Solution

  • You were close with your approach. This seems a typical backtracking problem, where for every taken set of units, you have to reduce the unit-pool. Here is a recursive generator function:

    def backtrack_groups(deps, units_per=2, used_units=None):
        if not deps:  # base case: no mo' departments
            yield []
            return
        used_units = used_units or set()
        poolsize = len(used_units) + units_per * len(deps)
        pool = [u for u in range(1, poolsize+1) if u not in used_units]
        # choose all possible unit combinations for the first (remaining) department ...
        dep, *rest = deps
        for units in combinations(pool, units_per):
            used_units.update(units)
            for group in backtrack_groups(rest, units_per, used_units):
                # ... and combine it with every possible group from
                # the remaining departments (honoring the used_units constraint)
                yield [f"{dep}{u}" for u in units] + group
            # as you have exhausted these two units for the current department
            # you put them back in the pool
            used_units.difference_update(units)
    
    >>> groups = backtrack_groups(departments)
    >>> next(groups)
    ['d11', 'd12', 'd23', 'd24', 'd35', 'd36', 'd47', 'd48']
    >>> next(groups)
    ['d11', 'd12', 'd23', 'd24', 'd35', 'd37', 'd46', 'd48']
    >>> next(groups)
    ['d11', 'd12', 'd23', 'd24', 'd35', 'd38', 'd46', 'd47']
    >>> next(groups)
    ['d11', 'd12', 'd23', 'd24', 'd36', 'd37', 'd45', 'd48']
    >>> next(groups)
    ['d11', 'd12', 'd23', 'd24', 'd36', 'd38', 'd45', 'd47']
    >>> next(groups)
    ['d11', 'd12', 'd23', 'd24', 'd37', 'd38', 'd45', 'd46']
    # ...
    

    And, general case:

    >>> list(backtrack_groups(["d1", "d2"], 1))
    [['d11', 'd22'], 
     ['d12', 'd21']]
    >>> list(backtrack_groups(["d1", "d2"], 2))
    [['d11', 'd12', 'd23', 'd24'], 
     ['d11', 'd13', 'd22', 'd24'], 
     ['d11', 'd14', 'd22', 'd23'], 
     ['d12', 'd13', 'd21', 'd24'], 
     ['d12', 'd14', 'd21', 'd23'], 
     ['d13', 'd14', 'd21', 'd22']]
    

    len(list(groups)) (with a new generator) is 2520 which is exactly nCr(8, 2) * nCr(6, 2) * nCr(4, 2) * nCr(2, 2) with nCr being n-choose-r or "n over r". So the combinatoric numbers work out ;-)