Search code examples
c++bitmapbit-manipulationsparse-matrix

Is it possible to store (0,1)-matrix with single bits rather than bool?


I need to store fairly large size (1M x 1M) square matrices on a limited memory device. The matrices consist of elements only "1" and "0".

I have read that bool could save 1/0 ( or true/false) but that is also 8 bits storage.So I could have stored 8 times larger size if I could store the values in a single bit.

Instead of going with multidimensional storage I chose to store matrices in single dimension matrix and accesing elements via matrix[row*N + column] where size = N * N

I have a representation below to explain stuff better

+ = 1
o = 0
  0 1 2 3 4 5
0 + o + o o o
1 o + + + o +
2 o o + o + o
3 + o o o o o
4 o o + + o 1

is converted to std::vector<unsigned char> matrix = {1,0,1,0,0,0,...}

I chose the unsigned char to store 1 & 0s and have 8 bits of minimum size in c++ I have the following code as a starting point:


 int each_row;
 int row_val;
  for (int row = 0; row < 8; row++) {
    for (int col = 0; col < 8; col++) {
      
      unsigned char val = matrix[row * N + col];
      each_row = each_row + to_string(val);
      
    }

    row_val = std::stoi(each_row, nullptr, 2); // to int

so my question is how can I store values in each row/col in a single bit of a char to compress data ?

Now each value of val = 0b00000001 or 0b00000000 I want to make use of all bits like the first row and second row combined to be stored as(due to size lower than 8) 0b10100001


Solution

  • This is exactly what std::vector<bool> is for.

    #include <vector>
    #include <string>
    
    
    int main()
    {
      std::string each_row;
      int row_val;
      int N = 8;
      std::vector<bool> matrix(N*N);
      for (int row = 0; row < N; row++) {
        for (int col = 0; col < N; col++) {
          
          unsigned char val = matrix[row * N + col];
          each_row = each_row + std::to_string(val);
          
        }
        row_val = std::stoi(each_row, nullptr, 2); // to int
      }
    }
    

    From cppreference:

    std::vector<bool> is a possibly space-efficient specialization of std::vector for the type bool.

    The manner in which std::vector<bool> is made space efficient (as well as whether it is optimized at all) is implementation defined. One potential optimization involves coalescing vector elements such that each element occupies a single bit instead of sizeof(bool) bytes.

    std::vector<bool> behaves similarly to std::vector, but in order to be space efficient, it:

    • Does not necessarily store its elements as a contiguous array.
    • Exposes class std::vector<bool>::reference as a method of accessing individual bits. In particular, objects of this class are returned by operator[] by value.
    • Does not use std::allocator_traits::construct to construct bit values.
    • Does not guarantee that different elements in the same container can be modified concurrently by different threads.

    Whether a specialization for std::vector<bool> was a good idea is a much discussed issue. See eg Why isn't vector<bool> a STL container?. Though, the differences mainly show up in generic code, where sometimes one has to add special cases, because std::vector<bool> is not like other std::vectors in its details.