Search code examples
pythonpython-itertools

Is there a way to avoid going through all the possible combinations generated by itertools.combinations() in python?


I am trying to generate 300 randomized sets/lines of 16 numbers each, from 1 to 64 using Python.

I'm using the itertools package to generate combinations, and this is my code:

import itertools
import random


def generate_combinations():
    combinations = list(itertools.combinations(range(1, 64), 16))
    random.shuffle(combinations)
    combinations = combinations[:300]
    return print(*combinations, sep = "\n") 

Based on my code, the combinations list is generated using the itertools.combinations() function, then those combinations are shuffled and lastly, I limit the list length to 300.

The issue is the time it takes to get the ~488,526,937,079,580 combinations in the first step. Is there any way I can achieve this more efficiently?


Solution

  • Any approach actually generating all those combos will run out of memory (300 trillion+), but there is a fast way to generate the nth combo using an itertools recipe.

    import math
    
    def nth_combination(iterable, r, index):
        "Equivalent to list(combinations(iterable, r))[index]"
        pool = tuple(iterable)
        n = len(pool)
        c = math.comb(n, r)
        if index < 0:
            index += c
        if index < 0 or index >= c:
            raise IndexError
        result = []
        while r:
            c, n, r = c*r//n, n-1, r-1
            while index >= c:
                index -= c
                c, n = c*(n-r)//n, n-1
            result.append(pool[-1-n])
        return tuple(result)
    

    Now you can just generate random indices and get the result:

    import random
    
    n = 64
    r = 16
    iterable = range(1, n)
    n_combos = math.comb(len(iterable), r)
    indices = random.sample(range(n_combos), k=300)  # random.choices if dupes are ok
    combos = [nth_combination(iterable, r, i) for i in indices]