c++ bit-manipulation ostream istream

How to save (and retrieve) to file a sequence of bits


I'm trying to store a sequence of bits in a file.

I'll describe only the essentials:

  • I have a vector<bool> (I know, not a good idea, but I only use it briefly)
  • I want to store it in a file (I'm using Linux)
  • I want to retrieve it from said file and recreate the vector

Since C++ can't write individual bits to a file, I group the bits into chars and write each char to the file. To do so I used this: http://www.avrfreaks.net/forum/tut-c-bit-manipulation-aka-programming-101

If the number of bits is a multiple of 8, everything works fine. If it isn't, I don't know how to handle the leftover bits.

I'll explain better. I have:

010011000110111101110010011001010110110101

I save the chars as:

01001100 -> L
01101111 -> o
01110010 -> r
01100101 -> e
01101101 -> m
01

That last "01"... I don't know how to store it. Of course I could create a byte with a 1 and some 0 padding... but I don't know the number of "extra bits" when I retrieve them! What is padding and what is info?

I simply don't know how to do this. Any ideas?

Some code for the file writer (not my actual code, which is too long; I wrote out only the important parts):

void Compressor::compress(std::istream &is, std::ostream &os) {
  queue<bool> bit_buffer;
  char c;

  while (is.get(c)) {
      // Expand each input char into its 8 bits (most significant bit first)
      const std::vector<bool> bit_c = char2bits(c);
      for (bool bit : bit_c)
        bit_buffer.push(bit);
  }
  //Here my code adds a certain number of bits, I simulate this with:
  bit_buffer.push(false);
  bit_buffer.push(true);

  // Write the bit buffer into a file
  while (bit_buffer.size() >= 8) {

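    // Group the next 8 queued bits into one char.
    // char2bits emits the most significant bit first, so fill from bit 7 down.
    char output = 0;
    for (int i=0; i<8; i++) {
      int bit = bit_buffer.front();
      bit_buffer.pop();
      if (bit) bit_set(output, BIT(7 - i));
      else bit_clear(output, BIT(7 - i));
    }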

    // Individually write chars in file
    os.write(&output,sizeof(char));
  }

  //????????
  //Last bits???
  //????????
}

vector<bool> char2bits (char c) {
  bitset<8> bit_c (c);
  vector<bool> bool_c;
  for (int i=7; i>=0; i--) {
    bool_c.push_back(bit_c[i]);
  }
  return bool_c;
}

Solution

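  • One way to do bit padding is to append a single 1 bit and then fill the rest of the byte with 0s (10...0).

    So 01 gets padded to 01100000.

    When decoding, discard the last 1 and every 0 after it; what remains is the data.

    If the data already fills the last byte completely, append a whole extra byte 10000000, so the decoder always finds a marker.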
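
    The padding scheme above could be sketched roughly as follows. This is only an illustration, not the asker's actual code: pack_bits and unpack_bits are hypothetical helper names, the bits are assumed to be packed most significant bit first (matching the 01001100 -> L example in the question), and the bytes are collected in a std::string instead of being written straight to the stream.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: pack bits MSB-first into bytes, then append a single
// 1 bit followed by as many 0 bits as needed to reach a byte boundary.
std::string pack_bits(const std::vector<bool>& bits) {
    std::string bytes;
    unsigned char current = 0;
    int filled = 0;  // number of bits already placed in `current`

    auto push = [&](bool bit) {
        current = static_cast<unsigned char>((current << 1) | (bit ? 1 : 0));
        if (++filled == 8) {
            bytes.push_back(static_cast<char>(current));
            current = 0;
            filled = 0;
        }
    };

    for (bool bit : bits) push(bit);
    push(true);                       // end-of-data marker
    while (filled != 0) push(false);  // zero padding up to the byte boundary
    return bytes;
}

// Hypothetical helper: expand bytes back into bits, then drop the trailing
// zeros and the marker 1 that precedes them.
std::vector<bool> unpack_bits(const std::string& bytes) {
    std::vector<bool> bits;
    bits.reserve(bytes.size() * 8);
    for (char ch : bytes) {
        const unsigned char byte = static_cast<unsigned char>(ch);
        for (int i = 7; i >= 0; --i) bits.push_back(((byte >> i) & 1) != 0);
    }
    while (!bits.empty() && !bits.back()) bits.pop_back();  // strip 0 padding
    if (!bits.empty()) bits.pop_back();                     // strip marker 1
    return bits;
}
```

    With helpers like these, the writer could finish with os.write(bytes.data(), bytes.size()) on a stream opened with std::ios::binary, and the reader could load the whole file into a std::string before calling unpack_bits. The cost is at most one extra byte per file, and no separate length field is needed.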