Search code examples
pythonbitset

Is there a builtin bitset that's similar to the std::bitset from C++?


I want to use a bit array in Python that I could use like the standard bitset from C++. Example:

#include<bitset>
int main() {
    std::bitset<100> numBits;
}

However, I don't know if there is something similar in Python, preferably built-in.


Solution

  • Thre is nothing built-in for that. If you need such a data structure in order to have a proper output of bytes, with the correct bits set, such as for a network protocol, a binary file structure or hardware control, sequencing a list of True and False values to a sequence of Bytes is easily feasible.

    One could also create a class to allow direct manypulation of in-memory bits in a bytearray object. However, unlikely what takes place in C++, you won't gain speed or memory (ok, for large bitsets you could gain memory) advantages for that - Python will process each bit as a full reference to the True or False objects (or to full 0 and 1 integers) regardless of what you do in code.

    That said, if you have a list with True and False values you want to output to, say, a file, as a sequence of bits, code like this might work:

    a = [True, True, False, False, False, True, ...]
    with open("myfile.bin", "wb" as file):
        for i, value in enumerate(a):
            if not i % 8:
                if i:
                    file.write(byte)
                byte = 0
            byte <<= 1
            byte |= value
         if i % 8:
            byte <<= (8 - i % 8)
            file.write(byte)
    

    A more sophisticated way is to create a full-class support for it, by keeping the values ina bytearray object, and computing each bit index at set and reset operations - a minimalist way of doing that is:

    class BitArray(object):
        def __init__(self, lenght):
            self.values = bytearray(b"\x00" * (lenght // 8 + (1 if lenght % 8  else 0)))
            self.lenght = lenght
    
        def __setitem__(self, index, value):
            value = int(bool(value)) << (7 - index % 8)
            mask = 0xff ^ (7 - index % 8)
            self.values[index // 8] &= mask
            self.values[index // 8] |= value
        def __getitem__(self, index):
            mask = 1 << (7 - index % 8)
            return bool(self.values[index // 8] & mask)
    
        def __len__(self):
            return self.lenght
    
        def __repr__(self):
            return "<{}>".format(", ".join("{:d}".format(value) for value in self))
    

    As you can see, there is no speed gain in doing so, and you'd need a lot of bits to take any benefit of memory savings with that. This is an example of the above class in use at the interactive prompt:

    In [50]: a = BitArray(16)
    
    In [51]: a[0] = 1
    
    In [52]: a[15] = 1
    
    In [53]: a
    Out[53]: <1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1>