c++ windows newline iostream ofstream

Different char values require different sizes in file


I have this code snippet to write a buffer to a file:

int WriteBufferToFile(std::string path, const char* buffer, int bufferSize) {
    std::ofstream ofs;
    ofs.open(path);

    if (!ofs) {
        return 1;
    }

    ofs.write(buffer, bufferSize);    

    if (!ofs) {
        return 2;
    }

    ofs.close();

    return 0;
}

Case 1

std::vector<char> buffer(1000000, 0);

WriteBufferToFile("myRawData", buffer.data(), 1000000);

Case 2

std::vector<char> buffer(1000000);

for (int i = 0; i < 1000000; i++) {
    buffer[i] = char(i);
}

WriteBufferToFile("myRawData2", buffer.data(), 1000000);

In Case 1 I'm writing 1 MB of zeros to a file, and the resulting file is also 1 MB in size. In Case 2 I write arbitrary chars (which should still take 1 MB in RAM) to a file, but now the file is larger. In my tests this seems to happen especially when the buffer contains chars with values >= 10.


Why is that, and is there a way to fix this?


Solution

  • I'm going to take a wild guess and say that you're running this code on a Windows system.

    Here's what I think is probably happening.

    ofs.open(path) opens the file in text mode. On Windows, text mode means that every newline character (value 10, 1 byte) is written as a CRLF sequence (2 bytes). Your buffer contains 1 million characters filled with the values 0 to 999999 modulo 256, so about 1 in 256 characters (3907 to be exact) is replaced by a 2-byte sequence, which accounts for the difference in file size. The all-zero buffer in Case 1 contains no newline characters, so its size is unchanged.

    To fix this, open the file in binary mode:

    ofs.open(path, std::ios_base::out | std::ios_base::binary);
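
    To check this explanation against the question's Case 2, here is a minimal sketch rather than a definitive implementation. The names WriteBufferToFileBinary, CountNewlines, FileSizeOnDisk and MakeCase2Buffer are hypothetical and are not part of the original code. The sketch opens the stream in binary mode and counts the newline bytes that text mode would have expanded on Windows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical variant of the question's WriteBufferToFile that opens the
// stream in binary mode, so bytes are written without newline translation.
int WriteBufferToFileBinary(const std::string& path, const char* buffer,
                            std::size_t bufferSize) {
    assert(buffer != nullptr || bufferSize == 0);

    std::ofstream ofs(path, std::ios_base::out | std::ios_base::binary);
    if (!ofs) {
        return 1;  // could not open the file
    }

    ofs.write(buffer, static_cast<std::streamsize>(bufferSize));
    ofs.close();   // flush now so that a late write error is still reported
    if (!ofs) {
        return 2;  // write or flush failed
    }
    return 0;
}

// Hypothetical helper: number of '\n' bytes in the buffer, i.e. how many
// extra bytes a text-mode stream would add on Windows.
std::size_t CountNewlines(const std::vector<char>& buffer) {
    return static_cast<std::size_t>(
        std::count(buffer.begin(), buffer.end(), '\n'));
}

// Hypothetical helper: size of the file in bytes, or -1 if it cannot be opened.
long long FileSizeOnDisk(const std::string& path) {
    std::ifstream ifs(path, std::ios_base::in | std::ios_base::binary |
                                std::ios_base::ate);
    if (!ifs) {
        return -1;
    }
    return static_cast<long long>(ifs.tellg());
}

// Rebuilds the buffer from Case 2: byte i holds the value i modulo 256.
std::vector<char> MakeCase2Buffer(std::size_t n) {
    std::vector<char> buffer(n);
    for (std::size_t i = 0; i < n; i++) {
        buffer[i] = static_cast<char>(i);
    }
    return buffer;
}
```

    Called with MakeCase2Buffer(1000000), CountNewlines should report 3907, and a file written with WriteBufferToFileBinary should be exactly 1000000 bytes. The same buffer written in text mode on Windows would be 3907 bytes larger.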