I'm trying to write some python code to list the different permutations of heads and tails given n tosses and exactly k heads. E.g. for n=6
tosses and k=3
heads, the output that I want is ['000111', '001011', '001101', '001110', '010011', '010101', '010110', '011001', '011010', '011100', '100011', '100101', '100110', '101001', '101010', '101100', '110001', '110010', '110100', '111000']
(Note: this is not a question about calculating probability, no issues there. My issue is with generating all of the different permutations for n tosses and exactly k heads).
I know that the number of final strings is given by the binomial coefficient aka combination aka n_choose_k (e.g. scipy.special.comb
). However, I'm having trouble generating each of these different options for more than a modest (n>10) number of tosses. My first naive approach is to generate all possible permutations of a string such as "1111100000"
(using itertools.permutations
), and then I remove duplicates using set()
. This works for low n (n<=10
). However, this is crazy inefficient. Even with n=10 & k=5
, the number of permutations is 3,628,800 while number of combinations is only 252. After n>10
this grinds to a halt. I know that itertools
also has combinations
generator (and _with_replacements
). However, I'm not sure if it's possible to use it to generate output in the way that I require, since my problem isn't exactly selecting a subset of k elements from a set of length n, where order does not matter.
My initial code is below.
from itertools import permutations
import scipy.special
n = 10
k = n//2
n_choose_k = scipy.special.comb(n, k)
print('{} choose {} = {}'.format(n, k, n_choose_k))
U,D = '1','0'
seed = U * k + D * (n-k)
permlist = [''.join(p) for p in permutations(seed)]
permset = sorted(list(set(permlist)))
print(permset)
print('len permutations:{}, len set:{}'.format(len(permlist), len(permset)))
Note: I'm interesting in n tosses and exactly k heads, though I would be curious to see how the solution could be expanded to at least k heads too.
I've accepted karl-knechtel's answer. But for those curious, here are all three methods with run times (on laptop with i7-9750H @2.6Ghz)
10 choose 5 = 252.0 Permutations: 252 items. Calculated in 1.0340404510498047s Combinations: 252 items. Calculated in 0.00103759765625s Binary ints: 252 items. Calculated in 0.002962350845336914s 12 choose 6 = 924.0 Permutations: Gave up after 30 seconds. Combinations: 924 items. Calculated in 0.001998424530029297s Binary ints: 924 items. Calculated in 0.023003339767456055s 14 choose 7 = 3432.0 Permutations: Gave up after 30 seconds. Combinations: 3432 items. Calculated in 0.011020183563232422s Binary ints: 3432 items. Calculated in 0.10396194458007812s 16 choose 8 = 12870.0 Permutations: Gave up after 30 seconds. Combinations: 12870 items. Calculated in 0.049001455307006836s Binary ints: 12870 items. Calculated in 0.42699623107910156s 20 choose 10 = 184756.0 Permutations: Gave up after 30 seconds. Combinations: 184756 items. Calculated in 0.6319949626922607s Binary ints: 184756 items. Calculated in 7.8870697021484375s 22 choose 11 = 705432.0 Permutations: Gave up after 30 seconds. Combinations: 705432 items. Calculated in 2.974030017852783s Binary ints: Gave up after 30 seconds. 24 choose 12 = 2704156.0 Permutations: Gave up after 30 seconds. Combinations: 2704156 items. Calculated in 11.795861721038818s Binary ints: Gave up after 30 seconds. 26 choose 13 = 10400600.0 Permutations: Gave up after 30 seconds. Combinations: 10400600 items. Calculated in 51.053600549697876s Binary ints: Gave up after 30 seconds.
import numpy as np
from itertools import permutations, combinations
import scipy.special
import time
n = 26
k = n//2
test_permutations = n <= 10
test_combinations = True
test_binaryints = n <= 20
test_print_outputs = n <= 6
n_choose_k = scipy.special.comb(n, k)
print('{} choose {} = {}'.format(n, k, n_choose_k))
U,D = '1','0'
def calc_permutations():
seed = U * k + D * (n-k)
l = [''.join(p) for p in permutations(seed)]
return list(set(l))
def calc_combinations():
'''Suggestion from https://stackoverflow.com/users/523612/karl-knechtel
https://stackoverflow.com/a/63100745/214488
'''
def bit_strings(size, one_count):
for one_indices in combinations(range(size), one_count):
yield ''.join(U if i in one_indices else D for i in range(size))
return list(bit_strings(n, k))
def calc_binaryints():
'''
Suggestion from https://stackoverflow.com/users/9046247/aramakus
https://stackoverflow.com/a/63100262/214488
'''
def check_bits(num, k, n):
""" True if `k` ones inside the number.
From https://stackoverflow.com/users/7505395/patrick-artner
https://stackoverflow.com/a/63100497/214488
"""
return sum( (num & 2**rank)>>rank for rank in range(n)) == k
def to_bin(i):
return format(i, '0'+str(n) + 'b')
start_int = 2**k-1
end_int = 2**n
l = [to_bin(i) for i in range(start_int, end_int) if check_bits(i,k,n)]
return l
def test_method(name, fn, do_test, sort=True):
print(name + ':', end='')
if do_test:
start_time = time.time()
l = fn()
duration = time.time() - start_time
if sort: l.sort()
print(' {} items. Calculated in {}s'.format(len(l), duration))
if test_print_outputs: print(l)
else:
print(' Gave up after 30 seconds.')
test_method('Permutations', calc_permutations, test_permutations)
test_method('Combinations', calc_combinations, test_combinations)
test_method('Binary ints', calc_binaryints, test_binaryints)
However, I'm not sure if it's possible to use it to generate output in the way that I require, since my problem isn't exactly selecting a subset of k elements from a set of length n, where order does not matter.
It is, actually, with a bit of lateral thinking. The set you want to select from is the set of indices where a 1 should appear in a given output.
So: use itertools.combinations
to determine the indices where the 1s should go (we are choosing k
values from the n
possible index values - 0
to n-1
, inclusive - without replacement; that is exactly what combination are), and then generate the string for each. For example, as a generator:
def bit_strings(size, one_count):
for one_indices in itertools.combinations(range(size), one_count):
yield ''.join('1' if i in one_indices else '0' for i in range(size))
>>> len(list(bit_strings(20, 10))) # takes a bit less than a second on my machine
184756
This is, of course, still exponentially (literally!) slower than just calculating the number of combinations directly.