c++ struct operator-overloading fstream unsigned-char

convert struct to unsigned char through overloaded operator << and >> (see update)


I have this struct with 2 attributes (one char and one int), which I expect to use 3 bytes of memory:

struct Node {
    char data;
    int frequency;
};

I am trying to overload the operators << and >> for this struct so that I can read and write it from and to a file using fstream. For operator << I have:

  friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
    string data = string(1, e.data) + to_string(e.frequency);
    output << data.data();
    return output;
  };

which makes me wonder how much space this writes to the output (3 bytes, as expected? 1 from the char and 2 from the int?)

When I want to save the struct to the file, I have this:

List<Node> inOrder = toEncode.inOrder();
for(int i=1; i<=inOrder.size(); i++) {
  output << inOrder.get(i)->getData();
}

where each node of the list inOrder and of the tree toEncode above holds the struct listed before, and inOrder.get(i)->getData() returns it. output is the fstream.

Now, how do I do the reading from the file? With operator >>, my understanding is that it needs to take an unsigned char array with 3 elements as input, convert the first element (1 byte) to a char, and convert the other 2 elements to an int. Is this correct? Can I do that with this method:

  friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
    ...
  };

or do I need to change the method signature (and parameters)? And for the file reading itself, taking into consideration that the characters I need to read are all on the first line of the file, what is the code to make the program read 3 characters at a time from this line and generate a struct from this data?

update

What I have so far:

To write to the file, I implemented this operator <<:

  friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
    union unsigned_data data;
    data.c = e.data;

    union unsigned_frequency frequency;
    frequency.f = e.frequency;

    output << data.byte << frequency.byte;
    return output;
  };

used this way:

List<HuffmanNode> inOrder = toEncode.inOrder();
for(int i=1; i<=inOrder.size(); i++)
  output << inOrder.get(i)->getData();

This seems to work, but I can't be sure without a way to read from the file, for which I have this:

operator>>:

  friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
    union unsigned_data data;
    data.c = e.data;

    union unsigned_frequency frequency;
    frequency.f = e.frequency;

    input >> data.byte >> frequency.byte;
    return input;
  };

used this way:

string line;
getline(input, line);

HuffmanNode node;
stringstream ss(line);
long pos = ss.tellp();
do {
  ss >> node;
  toDecode.insert(node);
  ss.seekp (pos+3);
} while(!ss.eof());

This seems to get stuck in an infinite loop. Both operators use these unions:

union unsigned_data {
  char c;
  unsigned char byte;
};

union unsigned_frequency {
  int f;
  unsigned char byte[sizeof(int)];
};

Solution

  • how much space this returns to the output (3 bytes, as expected? - 1 from the char and 2 from the int?)

    No. You are converting the values to std::strings, so they have variable lengths depending on the particular values (e.g., "123" takes up a different length than "1234567890"). What you describe applies to the binary storage of the values, not to the textual representation of the values. Even in binary form, sizeof(int) is implementation-defined and is usually 4 bytes on current platforms, not 2.
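To make the difference concrete, here is a small sketch comparing the two sizes. The helper names `textSize` and `binarySize` are made up for this illustration and are not part of the question's code; `textSize` mirrors the string-building approach from the first operator<<.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of characters the string-based operator<<
// from the question would emit (one char plus the decimal digits of the int).
std::size_t textSize(char data, int frequency) {
    return (std::string(1, data) + std::to_string(frequency)).size();
}

// Hypothetical helper: number of bytes if both fields are written raw.
// sizeof(int) is implementation-defined (commonly 4), so do not assume 3.
constexpr std::size_t binarySize() {
    return sizeof(char) + sizeof(int);
}
```

Assuming an int of at least 32 bits, `textSize('a', 123)` is 4 and `textSize('a', 1234567890)` is 11, while `binarySize()` is the same for every node.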

    Now, how I do the reading from the file? with the operator >>, what I understand is that it need take an unsigned char array with 3 elements as input, and take the first element (1 byte) and convert to char, and the 2 other elements and convert for an int. Is this correct?

    No. operator<< and operator>> are primarily meant to be used for formatted (textual) I/O. Your operator<< is actually writing formatted output (though you don't need to convert the values to std::strings first; you can write them as-is using the relevant overloads of operator<<). You just need to write the formatted data in such a way that your operator>> can reverse it. For example:

    friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
        output << int(e.data) << ' ' << e.frequency << ' ';
        return output;
    }
    
    friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
        int i;
        input >> i >> e.frequency;
        e.data = char(i);
        input.ignore();
        return input;
    }
    

    Alternatively:

    friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
        output << e.data << e.frequency << '\n';
        return output;
    }
    
    friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
        e.data = input.get();
        input >> e.frequency;
        input.ignore();
        return input;
    }
    

    The formatting is really up to you, based on your particular needs.
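As a rough illustration of how the first formatted pair could be used for reading, the sketch below wraps it in a minimal stand-in `HuffmanNode` and adds a read loop. The helper `readAll` and the use of `std::vector` are assumptions for this example (the question uses its own `List` and tree types), and the check inside operator>> is a small addition so that `e.data` is only assigned after a successful extraction.

```cpp
#include <sstream>
#include <vector>

// Minimal stand-in for the struct in the question.
struct HuffmanNode {
    char data;
    int frequency;

    // Writes the char as its numeric code, then the frequency, space-separated.
    friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
        output << int(e.data) << ' ' << e.frequency << ' ';
        return output;
    }

    // Reverses the format above; e is left partly untouched if extraction fails.
    friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
        int i;
        if (input >> i >> e.frequency) {
            e.data = char(i);
            input.ignore();  // skip the trailing separator
        }
        return input;
    }
};

// Hypothetical helper: keep extracting nodes until an extraction fails.
std::vector<HuffmanNode> readAll(std::istream& input) {
    std::vector<HuffmanNode> nodes;
    HuffmanNode node{};
    while (input >> node) {  // tests the stream state, not eof()
        nodes.push_back(node);
    }
    return nodes;
}
```

Testing the stream itself in the loop condition ends the loop as soon as an extraction fails for any reason, so there is no need to seek manually or poll `eof()`. Note also that `seekp`/`tellp` move the put (write) position; the get (read) position is controlled by `seekg`/`tellg`.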

    However, the operators can also be used to read/write binary data (just be sure to open the streams in binary mode), e.g.:

    friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
        output.write(&e.data, sizeof(e.data));
        output.write(reinterpret_cast<const char*>(&e.frequency), sizeof(e.frequency));
        return output;
    }
    
    friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
        input.read(&e.data, sizeof(e.data));
        input.read(reinterpret_cast<char*>(&e.frequency), sizeof(e.frequency));
        return input;
    }
    

    This is more in line with what you were thinking of.
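A possible round trip with the binary pair might look like the sketch below. It uses a minimal stand-in `HuffmanNode`, and the helper `readAllBinary` and `std::vector` are assumptions for illustration only. With a real file, the stream would be opened with `std::ios::binary`. The bytes are written in the machine's native `int` size and byte order, so a file produced this way is only guaranteed to be readable by a program built for a compatible platform.

```cpp
#include <sstream>
#include <vector>

// Minimal stand-in for the struct in the question.
struct HuffmanNode {
    char data;
    int frequency;

    // Writes the raw bytes of each field: sizeof(char) + sizeof(int) per node.
    friend std::ostream& operator<<(std::ostream& output, const HuffmanNode& e) {
        output.write(&e.data, sizeof(e.data));
        output.write(reinterpret_cast<const char*>(&e.frequency), sizeof(e.frequency));
        return output;
    }

    // Reads the raw bytes back in the same order.
    friend std::istream& operator>>(std::istream& input, HuffmanNode& e) {
        input.read(&e.data, sizeof(e.data));
        input.read(reinterpret_cast<char*>(&e.frequency), sizeof(e.frequency));
        return input;
    }
};

// Hypothetical helper: read fixed-size records until a read fails.
std::vector<HuffmanNode> readAllBinary(std::istream& input) {
    std::vector<HuffmanNode> nodes;
    HuffmanNode node{};
    while (input >> node) {  // read() sets failbit when too few bytes remain
        nodes.push_back(node);
    }
    return nodes;
}
```

Because `read` does not skip whitespace, nodes whose `data` is a space or a newline survive the round trip unchanged, which matters when every possible byte value can appear as a symbol.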