Search code examples
pythonalgorithmcombinationscombinatorics

How can I generate all unique nested 2-tuples (nested pairings) of a set of n objects in Python?


By nested 2-tuples, I mean something like this: ((a,b),(c,(d,e))) where all tuples have two elements. I don't need different orderings of the elements, just the different ways of putting parentheses around them. For items = [a, b, c, d], there are 5 unique pairings, which are:

(((a,b),c),d)  
((a,(b,c)),d)  
(a,((b,c),d))  
(a,(b,(c,d)))  
((a,b),(c,d))

In a perfect world I'd also like to have control over the maximum depth of the returned tuples, so that if I generated all pairings of items = [a, b, c, d] with max_depth=2, it would only return ((a,b),(c,d)).

This problem turned up because I wanted to find a way to generate the results of addition on non-commutative, non-associative numbers. If a+b doesn't equal b+a, and a+(b+c) doesn't equal (a+b)+c, what are all the possible sums of a, b, and c?

I have made a function that generates all pairings, but it also returns duplicates.

import itertools

def all_pairings(items):
    if len(items) == 2:
        yield (*items,)
    else:
        for i, pair in enumerate(itertools.pairwise(items)):
            for pairing in all_pairings(items[:i] + [pair] + items[i+2:]):
                yield pairing

For example, it returns ((a,b),(c,d)) twice for items=[a, b, c, d], since it pairs up (a,b) first in one case and (c,d) first in the second case.

Returning duplicates becomes a bigger and bigger problem for larger numbers of items. With duplicates, the number of pairings grows factorially, and without duplicates it grows exponentially, according to the Catalan Numbers (https://oeis.org/A000108).

n With duplicates: (n-1)! Without duplicates: (2(n-1))!/(n!(n-1)!)
1 1 1
2 1 1
3 2 2
4 6 5
5 24 14
6 120 42
7 720 132
8 5040 429
9 40320 1430
10 362880 4862

Because of this, I have been trying to come up with an algorithm that doesn't need to search through all the possibilities, only the unique ones. Again, it would also be nice to have control over the maximum depth, but that could probably be added to an existing algorithm. So far I've been unsuccessful in coming up with an approach, and I also haven't found any resources that cover this specific problem. I'd appreciate any help or links to helpful resources.


Solution

  • Using a recursive generator:

    items = ['a', 'b', 'c', 'd']
    
    def split(l):
        if len(l) == 1:
            yield l[0]
        for i in range(1, len(l)):
            for a in split(l[:i]):
                for b in split(l[i:]):
                    yield (a, b)
            
    list(split(items))
    

    Output:

    [('a', ('b', ('c', 'd'))),
     ('a', (('b', 'c'), 'd')),
     (('a', 'b'), ('c', 'd')),
     (('a', ('b', 'c')), 'd'),
     ((('a', 'b'), 'c'), 'd')]
    

    Check of uniqueness:

    assert len(list(split(list(range(10))))) == 4862
    

    Reversed order of the items:

    items = ['a', 'b', 'c', 'd']
    
    def split(l):
        if len(l) == 1:
            yield l[0]
        for i in range(len(l)-1, 0, -1):
            for a in split(l[:i]):
                for b in split(l[i:]):
                    yield (a, b)
            
    list(split(items))
    
    [((('a', 'b'), 'c'), 'd'),
     (('a', ('b', 'c')), 'd'),
     (('a', 'b'), ('c', 'd')),
     ('a', (('b', 'c'), 'd')),
     ('a', ('b', ('c', 'd')))]
    

    With maxdepth:

    items = ['a', 'b', 'c', 'd']
    
    def split(l, maxdepth=None):
        if len(l) == 1:
            yield l[0]
        elif maxdepth is not None and maxdepth <= 0:
            yield tuple(l)
        else:
            for i in range(1, len(l)):
                for a in split(l[:i], maxdepth=maxdepth and maxdepth-1):
                    for b in split(l[i:], maxdepth=maxdepth and maxdepth-1):
                        yield (a, b)
    
    list(split(items))
    # or
    list(split(items, maxdepth=3))
    # or
    list(split(items, maxdepth=2))
    
    [('a', ('b', ('c', 'd'))),
     ('a', (('b', 'c'), 'd')),
     (('a', 'b'), ('c', 'd')),
     (('a', ('b', 'c')), 'd'),
     ((('a', 'b'), 'c'), 'd')]
    
    list(split(items, maxdepth=1))
    
    [('a', ('b', 'c', 'd')),
     (('a', 'b'), ('c', 'd')),
     (('a', 'b', 'c'), 'd')]
    
    list(split(items, maxdepth=0))
    
    [('a', 'b', 'c', 'd')]