c++ save short

Save short int in binary file instead of text file


Let's say I have a vector with 9 integers.

In total, I should have 36 bytes.

Some of these integers fit in a short, so I want to store the ones that fit as a short in 2 bytes and the ones that don't in 4.

I noticed that a file containing 120 98 99 99 98 257 259 98 0 was 28 bytes, and I wonder what I did wrong.

ofstream out(file, ios::binary);
int len = idx.size();                    //idx is the vector<int>
string end = " 0", space = " ";          //end is just to finish the saving.
for(int i = 0; i < len; i++) {
    if(idx[i] <= SHRT_MAX){
        short half = idx[i];
        out<<half;
    }
    else out<<idx[i];
    if(i == len-1) out<<end; else out<<space;
}

Solution

  • First piece of advice: use the <cstdint> header if you want to work with types of a guaranteed size. Fixed-width types such as uint16_t are standard and exist for exactly this purpose.

    Next, about the idea of sometimes writing two bytes and sometimes writing four: keep in mind that when you write data to a file like this, the result is just one big chunk of bytes. A reader has no way of knowing when to read two bytes and when to read four. You could store metadata describing the size of each value, but that would probably be less efficient than consistently using the same size. Write everything as two bytes or everything as four bytes. Which one is up to you, but whichever you choose, stick with it.

    Now, moving on to why you have 28 bytes of data written.

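    You're writing the ASCII representations of your numbers: operator<< formats a short or an int as decimal text, even on a stream opened with ios::binary. The output is "120 98 99 99 98 257 259 98 0", which is 28 characters and therefore 28 bytes.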
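To make the 28 bytes concrete, here is a minimal sketch that repeats the question's loop against an in-memory string stream instead of a file. The helper name format_as_text is hypothetical, and the eight input values are an assumption read off the file contents quoted in the question (the final 0 comes from the " 0" terminator).

```cpp
#include <climits>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats the values the same way the question's loop
// does (operator<< on each number, a space between values, " 0" after the
// last one) and returns the text that would end up in the file.
std::string format_as_text(const std::vector<int>& idx) {
    std::ostringstream out;
    for (std::size_t i = 0; i < idx.size(); ++i) {
        if (idx[i] <= SHRT_MAX) {
            short half = static_cast<short>(idx[i]);
            out << half;      // still decimal text, not 2 raw bytes
        } else {
            out << idx[i];    // also decimal text, not 4 raw bytes
        }
        out << (i + 1 == idx.size() ? " 0" : " ");
    }
    return out.str();
}
```

Calling format_as_text({120, 98, 99, 99, 98, 257, 259, 98}) returns a 28-character string, one byte per character, which matches the file size seen in the question.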

    When writing your data, you probably want to do something like

    out.write(reinterpret_cast<const char*>(&my_data), sizeof(my_data));
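As a fuller sketch of that write call, the function below stores every element with the same fixed width, as recommended above. The name write_raw_int32 is hypothetical, and the sketch assumes every value fits in 32 bits. It writes in the machine's native byte order, so it is not yet portable.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes each value as a 4-byte std::int32_t in the
// native byte order of the machine. Returns false if opening or writing
// the file failed.
bool write_raw_int32(const std::string& path, const std::vector<int>& idx) {
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    for (int value : idx) {
        // Assumption: value fits in 32 bits, so the cast loses nothing.
        const std::int32_t fixed = static_cast<std::int32_t>(value);
        out.write(reinterpret_cast<const char*>(&fixed), sizeof(fixed));
    }
    out.flush();  // surface write errors before reporting success
    return static_cast<bool>(out);
}
```

With nine values this produces a file of 9 * 4 = 36 bytes, the size the question expected.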
    

    Keep in mind, though, that this isn't really a safe way to write binary data. I think you already understand the need to write exactly the size you intend. Sadly, the complications of creating portable files don't end there: you also need to worry about the endianness of the machine your program is running on. This is an article that I think you might enjoy reading to learn more about the subject.

    Disch's Tutorial To Good Binary Files