Search code examples
pythonencodingintegercombinationsbinomial-coefficients

Python - Encoding all possible 5-combinations into an integer


I've got a set of 25 integers, ranging from 0 to 24 and my application involves selecting 5 of them (there are no repeated values, no value can be selected more than once) obtaining a combination like this one [15, 7, 12, 3, 22]. It's important to consider the the previous combination is considered equal to [7, 22, 12, 15, 3], the order doesn't matter, only the values do.

By applying the binomial coefficient (25 choose 5) we can find out that there are 53.130 possible combinations. I want to encode all the possible combinations into an integer so that all values from 0 to 53129 are linked to a combination.


Solution

  • Use more_itertools.nth_combination that can compute the nth combination without computing all previous ones:

    # pip install more-itertools
    from more_itertools import nth_combination
    
    nth_combination(range(25), 5, 0)
    # (0, 1, 2, 3, 4)
    
    nth_combination(range(25), 5, 42)
    # (0, 1, 2, 5, 7)
    
    nth_combination(range(25), 5, 53129)
    # (20, 21, 22, 23, 24)
    

    You can make things more interesting by combining the above with functools.partial/cache:

    from functools import partial, cache
    
    encode = partial(nth_combination, range(25), 5)
    # or with a cache
    # encode = cache(partial(nth_combination, range(25), 5))
    
    encode(0)
    # (0, 1, 2, 3, 4)
    
    encode(42)
    # (0, 1, 2, 5, 7)
    
    encode(53129)
    # (20, 21, 22, 23, 24)
    

    efficiency

    The advantage of nth_combination is that for large ranges, it is not needed to compute all n-1 combinations to access the nth one. Also, it is not needed to store all combinations, making it both CPU and memory efficient. Combined with cache this makes a compromise between memory and CPU by avoiding to recompute the same value twice if the same code is requested several times.

    However, if you must access all values eventually, then pre-computing all combinations as show by @ti7 will be more straightforward and efficient, but will require to compute and store all values from the beginning:

    from itertools import combinations
    
    encode = list(combinations(range(25), 5))
    
    encode[0]
    # (0, 1, 2, 3, 4)
    
    encode[42]
    # (0, 1, 2, 5, 7)
    
    encode[53129]
    # (20, 21, 22, 23, 24)