So I've indexed the members of the powerset of the alphabet in the following way:
def bytecode_index(some_subset):
mask = 0
for index in bytearray(some_subset):
mask |= (1<<(index-97))
return mask
This might be a bit nonstandard, and improvements with respect to improving it are very much welcome, but the crux of my question is actually as follows:
How can I take two such masks and determine if one is a subset of the other efficiently and pythonically? One such way for determining if index1
is a subset of index2
is to compare their binary strings. if index1
has a 1 where index2
has a 0, the set corresponding to index1
is not a subset of the set corresponding to index2
.
I've written something that looks like this for that:
def compare_binary_strings(index1,index2):
return not any(x == "1" and y == "0" for x,y in zip(bin(index1), bin(index2)))
This seems inefficient, as it involves converting the indices to strings and then comparing them componentwise. Any and all help is much appreciated.
Is there a more simple operation available to quickly compare the two indices?
Well I don't know about Pythonically, but generically the way to check if one bitmask is a subset of the other is:
(x & y) == x
Iff true, x
is a subset of y
.
This just the bitmath equivalent of the familiar
A ⊆ B ⇔ A ∩ B = A