c++ struct operator-overloading fstream binary-data

Segmentation fault when reading a binary file using an overloaded operator>>


I am trying to read a binary file that was created with code like this:

#include <list>
#include <string>
#include <bitset>
#include <fstream>

struct Node {
  char data;
  int frequency;

  friend std::istream& operator>>(std::istream& input, Node& e) {
    input.read(&e.data, sizeof(e.data));
    input.read(reinterpret_cast<char*>(e.frequency), sizeof(e.frequency));
    return input;
  };

  friend std::ostream& operator<<(std::ostream& output, const Node& e) {
    output.write(&e.data, sizeof(e.data));
    output.write(reinterpret_cast<const char*>(&e.frequency), sizeof(e.frequency));
    return output;
  };
};

int main() {
  std::list<struct Node> list;

  for(char i='a'; i<='z'; i++) {
    struct Node n;
    n.data = i;
    n.frequency = 1;
    list.push_back(n);
  }

  std::string encoded_file = "";

  for(int i=0; i<100; i++) {
    encoded_file = encoded_file + "101010";
  }

  std::fstream output;
  output.open("output.txt", std::ios_base::out | std::ios_base::binary);

  if (output.is_open()) {
    output << list.size();
    for(int i=0; i<list.size(); i++) {
      output << list.front();
      list.pop_front();
    }

    for(long unsigned int i=0; i<encoded_file.length(); i+=8) {
      std::string data = encoded_file.substr(i, 8);
      std::bitset<8> b(data);
      unsigned long x = b.to_ulong();
      unsigned char c = static_cast<unsigned char>( x );
      output << c;
    }
  }

  output.close();

  return 0;
}

This code seems to work fine, and the file output.txt is generated without problems.

But, when I try reading the file with this code:

#include <list>
#include <string>
#include <bitset>
#include <fstream>
#include <iostream>

struct Node {
  char data;
  int frequency;

  friend std::istream& operator>>(std::istream& input, Node& e) {
    input.read(&e.data, sizeof(e.data));
    input.read(reinterpret_cast<char*>(e.frequency), sizeof(e.frequency));
    return input;
  };

  friend std::ostream& operator<<(std::ostream& output, const Node& e) {
    output.write(&e.data, sizeof(e.data));
    output.write(reinterpret_cast<const char*>(&e.frequency), sizeof(e.frequency));
    return output;
  };
};

int main() {
  std::list<struct Node> list;
  std::string encoded_file = "";

  std::fstream input;
  input.open("output.txt", std::ios_base::in | std::ios_base::binary);

  if (input.is_open()) {
    std::cout << "1" << std::endl;
    int size = 0;
    input >> size;
    std::cout << "size: " << size << std::endl;

    for(int i=0; i<size; i++) {
      Node node;
      input >> node;
      std::cout << node.data << " (" << node.frequency << ")" << std::endl;
      list.push_back(node);
    }
    std::cout << "2" << std::endl;

    char c;
    while(input.get(c)) {
      std::bitset<8> b(c);
      encoded_file = encoded_file + b.to_string();
    }
    std::cout << "3" << std::endl;
  }

  input.close();

  return 0;
}

I get a segmentation fault. The error occurs when input >> node; executes. I checked, and when the program enters Node::operator>>, e.data is read, but e.frequency is not.

Can anyone give me any tips on how to fix this?


Solution

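  • You have a typo in your Node::operator>>. When reading e.frequency, you are missing a &. Without it, the (uninitialized) value of e.frequency is itself cast to a pointer, and read() writes to whatever address that value happens to name, hence the segmentation fault: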

    input.read(reinterpret_cast<char*>(&e.frequency), sizeof(e.frequency));
                                       ^
    

    That was a typo in my previous answer where you got this code from. I have corrected that mistake.
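
To check the corrected operator in isolation, here is a minimal sketch that sends one Node through an in-memory std::stringstream instead of a file. The round_trip helper is hypothetical and exists only for this illustration; the Node operators are the ones from the question with the & added.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

struct Node {
  char data;
  int frequency;

  friend std::istream& operator>>(std::istream& input, Node& e) {
    input.read(&e.data, sizeof(e.data));
    // Note the &: pass the address of e.frequency, not its value.
    input.read(reinterpret_cast<char*>(&e.frequency), sizeof(e.frequency));
    return input;
  }

  friend std::ostream& operator<<(std::ostream& output, const Node& e) {
    output.write(&e.data, sizeof(e.data));
    output.write(reinterpret_cast<const char*>(&e.frequency), sizeof(e.frequency));
    return output;
  }
};

// Hypothetical helper: writes a node to an in-memory stream and reads it back.
Node round_trip(const Node& in) {
  std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
  buffer << in;
  Node out{};
  buffer >> out;
  assert(!buffer.fail());  // both read() calls got all the bytes they asked for
  return out;
}
```

Calling round_trip(Node{'a', 42}) should hand back a node with the same data and frequency. Because the raw bytes of the int are written as-is, the result is only portable between programs that agree on sizeof(int) and byte order.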


    With that said, I see other problems in your code.

    You are creating the file in binary mode, but you are not streaming everything into it as binary values. Only your Node class is being streamed as binary, but your other values are being streamed as text instead. Don't mix formatting schemes.

    output << list.size() and output << c are performing formatted I/O. Use write() instead of operator<< for binary I/O, e.g.:

    size_t size = list.size();
    output.write(reinterpret_cast<char*>(&size), sizeof(size));
    ...
    unsigned char c = ...;
    output.write(reinterpret_cast<char*>(&c), sizeof(c));
    

    And then reverse that process when reading the file, e.g.:

    input.read(reinterpret_cast<char*>(&size), sizeof(size));
    ...
    unsigned char c;
    while(input.read(reinterpret_cast<char*>(&c), sizeof(c))) {
         ...
    }
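
Putting the two fragments above together, the following is a sketch of a full write/read round trip for the element count and the packed bits. The function names write_payload and read_payload are made up for this example, and it uses std::uint64_t rather than size_t for the count (an assumption on my part, so that the field has the same width on every platform). It also assumes the bit string's length is a multiple of 8, as it is in the question (600 bits).

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes the count as a fixed-width integer, then every group of 8 bits as one raw byte.
void write_payload(std::ostream& output, std::uint64_t count, const std::string& bits) {
  output.write(reinterpret_cast<const char*>(&count), sizeof(count));
  for (std::string::size_type i = 0; i < bits.length(); i += 8) {
    std::bitset<8> b(bits.substr(i, 8));
    unsigned char c = static_cast<unsigned char>(b.to_ulong());
    output.write(reinterpret_cast<const char*>(&c), sizeof(c));
  }
}

// Reads the count back, then rebuilds the bit string one byte at a time until EOF.
std::string read_payload(std::istream& input, std::uint64_t& count) {
  input.read(reinterpret_cast<char*>(&count), sizeof(count));
  std::string bits;
  unsigned char c;
  while (input.read(reinterpret_cast<char*>(&c), sizeof(c))) {
    bits += std::bitset<8>(c).to_string();
  }
  return bits;
}
```

With the question's 600-bit string, write_payload emits sizeof(std::uint64_t) + 75 bytes, and read_payload returns the original string. In the real file the Node records would sit between the count and the packed bytes; they are left out here to keep the sketch short.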
    

    Also, in your code that is creating the file, the for loop that writes the nodes is wrong. You are modifying the list while looping through it, so its size() shrinks by one on every iteration while i grows by one. The loop therefore stops halfway, and only 13 of the 26 nodes are written. You should iterate using iterators instead of indexes, and without modifying the list at all, e.g.:

    for(std::list<Node>::const_iterator iter = list.cbegin(); iter != list.cend(); ++iter) {
        output << *iter;
    }
    

    Or simpler:

    for(const auto &item : list) {
        output << item;
    }
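
To make the effect of the original loop concrete, this small sketch counts how many elements each style of loop visits. Both function names are invented for the illustration, and a std::list<char> stands in for the list of nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Mirrors the question's loop: i grows while size() shrinks, so they meet halfway.
// The list is taken by value so the caller's copy is left untouched.
std::size_t visits_with_pop_front(std::list<char> items) {
  std::size_t visited = 0;
  for (std::size_t i = 0; i < items.size(); i++) {
    ++visited;
    items.pop_front();
  }
  return visited;
}

// Iterates without modifying the list, so every element is visited exactly once.
std::size_t visits_with_range_for(const std::list<char>& items) {
  std::size_t visited = 0;
  for (const auto& item : items) {
    (void)item;  // the element itself is not needed for counting
    ++visited;
  }
  return visited;
}
```

For a list holding 'a' through 'z', visits_with_pop_front returns 13 and visits_with_range_for returns 26.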