I want to find binary strings of a certain weight. The amount of such strings grows to the point of a memory error, so I'm currently generating them with a generator. This code generators all length n binary strings with weight k:
def kbits(n, k):
for bits in itertools.combinations(range(n), k):
s = ['0'] * n
for bit in bits:
s[bit] = '1'
yield ''.join(s)
for b in kbits(length, weight):
print(b)
So for length = 3 and weight = 2, we get 110, 101, 011.
My research requires me to parse through values such as n = 56 and k = 7, which takes around 24 hours on my device. I'd also like to try n = 72 and k = 8, which (based on the time of the previous result) may take 365 days. So I'm wondering two things:
Is this the quickest (non-memory) intensive way of generating these binary strings?
Is it possible to have multiple cores of my CPU working on this at once? I'm assuming itertools is parsing through a sequence. If (let's say) we had a 2-core CPU, would it be possible to have the first core parse the first 50% of the sequence and the second core to do the latter half?
EDIT:
Perhaps I should mention that for each boolean b, I'd like to perform the following least-squares computation, where N is some defined matrix:
for b in kbits(size, max_coclique):
v = np.linalg.lstsq(N,np.array(list(b), dtype = float))
i.e. I require the ultimate expected output format for b to be a numpy
array with 0/1 values. (That is unless there is some extremely fast way of doing all of this - including the least-squares computation - in a different way.)
Note: I'm also running this in Sage, as I am utilizing its database of transitive groups.
Given a value with weight k, you can get the lexically next value as follows:
This is the binary version of the Pandita algorithm: https://en.wikipedia.org/wiki/Permutation#Generation_in_lexicographic_order
You can do it with bit manipulations like this:
def kbits(n, k):
limit=1<<n
val=(1<<k)-1
while val<limit:
yield "{0:0{1}b}".format(val,n)
minbit=val&-val #rightmost 1 bit
fillbit = (val+minbit)&~val #rightmost 0 to the left of that bit
val = val+minbit | (fillbit//(minbit<<1))-1
There are probably some opportunities for optimization remaining, but the time will be dominated by formatting the values as binary strings in the yield
statement.