Search code examples
pythoncombinationspermutationprobabilitycombinatorics

list of permutations of HEADS and TAILS in n tosses with exactly k HEADS in Python


I'm trying to write some python code to list the different permutations of heads and tails given n tosses and exactly k heads. E.g. for n=6 tosses and k=3 heads, the output that I want is ['000111', '001011', '001101', '001110', '010011', '010101', '010110', '011001', '011010', '011100', '100011', '100101', '100110', '101001', '101010', '101100', '110001', '110010', '110100', '111000'] (Note: this is not a question about calculating probability, no issues there. My issue is with generating all of the different permutations for n tosses and exactly k heads).

I know that the number of final strings is given by the binomial coefficient aka combination aka n_choose_k (e.g. scipy.special.comb). However, I'm having trouble generating each of these different options for more than a modest (n>10) number of tosses. My first naive approach is to generate all possible permutations of a string such as "1111100000" (using itertools.permutations), and then I remove duplicates using set(). This works for low n (n<=10). However, this is crazy inefficient. Even with n=10 & k=5, the number of permutations is 3,628,800 while number of combinations is only 252. After n>10 this grinds to a halt. I know that itertools also has combinations generator (and _with_replacements). However, I'm not sure if it's possible to use it to generate output in the way that I require, since my problem isn't exactly selecting a subset of k elements from a set of length n, where order does not matter.

My initial code is below.

from itertools import permutations
import scipy.special

n = 10
k = n//2
n_choose_k = scipy.special.comb(n, k)
print('{} choose {} = {}'.format(n, k, n_choose_k))

U,D = '1','0'
seed = U * k + D * (n-k)
permlist = [''.join(p) for p in permutations(seed)]
permset = sorted(list(set(permlist)))
print(permset)
print('len permutations:{}, len set:{}'.format(len(permlist), len(permset)))

Note: I'm interesting in n tosses and exactly k heads, though I would be curious to see how the solution could be expanded to at least k heads too.


UPDATE:

I've accepted karl-knechtel's answer. But for those curious, here are all three methods with run times (on laptop with i7-9750H @2.6Ghz)

Results:

10 choose 5 = 252.0
Permutations: 252 items. Calculated in 1.0340404510498047s
Combinations: 252 items. Calculated in 0.00103759765625s
Binary ints: 252 items. Calculated in 0.002962350845336914s

12 choose 6 = 924.0
Permutations: Gave up after 30 seconds.
Combinations: 924 items. Calculated in 0.001998424530029297s
Binary ints: 924 items. Calculated in 0.023003339767456055s

14 choose 7 = 3432.0
Permutations: Gave up after 30 seconds.
Combinations: 3432 items. Calculated in 0.011020183563232422s
Binary ints: 3432 items. Calculated in 0.10396194458007812s

16 choose 8 = 12870.0
Permutations: Gave up after 30 seconds.
Combinations: 12870 items. Calculated in 0.049001455307006836s
Binary ints: 12870 items. Calculated in 0.42699623107910156s

20 choose 10 = 184756.0
Permutations: Gave up after 30 seconds.
Combinations: 184756 items. Calculated in 0.6319949626922607s
Binary ints: 184756 items. Calculated in 7.8870697021484375s

22 choose 11 = 705432.0
Permutations: Gave up after 30 seconds.
Combinations: 705432 items. Calculated in 2.974030017852783s
Binary ints: Gave up after 30 seconds.

24 choose 12 = 2704156.0
Permutations: Gave up after 30 seconds.
Combinations: 2704156 items. Calculated in 11.795861721038818s
Binary ints: Gave up after 30 seconds.

26 choose 13 = 10400600.0
Permutations: Gave up after 30 seconds.
Combinations: 10400600 items. Calculated in 51.053600549697876s
Binary ints: Gave up after 30 seconds.

Code:

import numpy as np
from itertools import permutations, combinations
import scipy.special
import time

n = 26
k = n//2

test_permutations = n <= 10
test_combinations = True
test_binaryints = n <= 20

test_print_outputs = n <= 6

n_choose_k = scipy.special.comb(n, k)
print('{} choose {} = {}'.format(n, k, n_choose_k))

U,D = '1','0'

def calc_permutations():
    seed = U * k + D * (n-k)
    l = [''.join(p) for p in permutations(seed)]
    return list(set(l))


def calc_combinations():
    '''Suggestion from https://stackoverflow.com/users/523612/karl-knechtel
    https://stackoverflow.com/a/63100745/214488
    '''
    def bit_strings(size, one_count):
        for one_indices in combinations(range(size), one_count):
            yield ''.join(U if i in one_indices else D for i in range(size))

    return list(bit_strings(n, k))


def calc_binaryints():
    '''
    Suggestion from https://stackoverflow.com/users/9046247/aramakus
    https://stackoverflow.com/a/63100262/214488
    '''
    def check_bits(num, k, n):
        """ True if `k` ones inside the number.
        From https://stackoverflow.com/users/7505395/patrick-artner
        https://stackoverflow.com/a/63100497/214488
        """
        return sum( (num & 2**rank)>>rank for rank in range(n)) == k
        
    def to_bin(i):
        return format(i, '0'+str(n) + 'b')

    start_int = 2**k-1
    end_int = 2**n
    l = [to_bin(i) for i in range(start_int, end_int) if check_bits(i,k,n)]
    return l


def test_method(name, fn, do_test, sort=True):
    print(name + ':', end='')
    if do_test:
        start_time = time.time()
        l = fn()
        duration = time.time() - start_time
        if sort: l.sort()
        print(' {} items. Calculated in {}s'.format(len(l), duration))
        if test_print_outputs: print(l)
    else:
        print(' Gave up after 30 seconds.')

test_method('Permutations', calc_permutations, test_permutations)
test_method('Combinations', calc_combinations, test_combinations)
test_method('Binary ints', calc_binaryints, test_binaryints)

Solution

  • However, I'm not sure if it's possible to use it to generate output in the way that I require, since my problem isn't exactly selecting a subset of k elements from a set of length n, where order does not matter.

    It is, actually, with a bit of lateral thinking. The set you want to select from is the set of indices where a 1 should appear in a given output.

    So: use itertools.combinations to determine the indices where the 1s should go (we are choosing k values from the n possible index values - 0 to n-1, inclusive - without replacement; that is exactly what combination are), and then generate the string for each. For example, as a generator:

    def bit_strings(size, one_count):
        for one_indices in itertools.combinations(range(size), one_count):
            yield ''.join('1' if i in one_indices else '0' for i in range(size))
    
    >>> len(list(bit_strings(20, 10))) # takes a bit less than a second on my machine
    184756
    

    This is, of course, still exponentially (literally!) slower than just calculating the number of combinations directly.