Search code examples
pythonuniquepermutationpython-itertools

Given a string of the alphabet, return sets of three items, with each item being a unique set of two characters from the alphabet


I'm using Python 3.10. I have a string of the alphabet 'A-Z'. What I require is every set of three, two character permutations, all unique, so a letter can't appear in a list more than once.

So, ['ab', 'cd', 'ef'], ['ac', 'bd', 'gh'] are valid but ['ab', 'ba' 'cd'] is not.

I've attempted:

full_list = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
out = []
out.append(list(itertools.permutations(full_list, 8)))

To at least get an 8 character string, which I could then split up but the result number is just far too vast.


Solution

  • I think this'll do it:

    import itertools
    
    full_list = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    for chunk in (
        [sorted(s) for s in sorted({
            frozenset([
                ''.join(sorted(p[i*2:i*2+2])) for i in range(3)
            ]) 
            for p in itertools.permutations(c)
        })]
        for c in itertools.combinations(full_list, 6)
    ):
        for pairs in chunk:
            print(pairs)
    

    The approach is to break it into two parts:

    1. Get all the combinations of 6 unique letters (that's the itertools.combinations call)
    2. Get all the unique ways you can divide those 6 letters into 3 unordered pairs.

    The first chunk, for example, is all the sets of pairs you can make out of the letters ABCDEF:

    ['AB', 'CE', 'DF']
    ['AB', 'CD', 'EF']
    ['AE', 'BD', 'CF']
    ['AF', 'BC', 'DE']
    ['AF', 'BD', 'CE']
    ['AF', 'BE', 'CD']
    ['AB', 'CF', 'DE']
    ['AC', 'BD', 'EF']
    ['AE', 'BC', 'DF']
    ['AC', 'BE', 'DF']
    ['AD', 'BF', 'CE']
    ['AC', 'BF', 'DE']
    ['AD', 'BC', 'EF']
    ['AD', 'BE', 'CF']
    ['AE', 'BF', 'CD']
    

    followed by all the pairs you can make from ABCDEG:

    ['AD', 'BE', 'CG']
    ['AC', 'BG', 'DE']
    ['AD', 'BG', 'CE']
    ['AE', 'BG', 'CD']
    ['AB', 'CE', 'DG']
    ['AE', 'BD', 'CG']
    ['AC', 'BE', 'DG']
    ['AC', 'BD', 'EG']
    ['AB', 'CG', 'DE']
    ['AG', 'BC', 'DE']
    ['AG', 'BE', 'CD']
    ['AD', 'BC', 'EG']
    ['AG', 'BD', 'CE']
    ['AB', 'CD', 'EG']
    ['AE', 'BC', 'DG']
    

    and it continues from there. You'll end up with something like 3.5M sets of pairs.

    You can assign them all to a list, e.g.:

    out = [
        pairs for chunk in (
            [sorted(s) for s in sorted({
                frozenset([
                    ''.join(sorted(p[i*2:i*2+2])) for i in range(3)
                ]) 
                for p in itertools.permutations(c)
            })]
            for c in itertools.combinations(full_list, 6)
        ) for pairs in chunk
    ]
    

    but be advised that this takes a while -- I just timed it at a little over 5 minutes on my PC.