How to write portable binary files in C?


Let us consider the following piece of code:

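#include <stdio.h>
int main(void){
    int val = 30;
    FILE *file;
    /* Store the stream in file and write only if the open succeeded. */
    if((file = fopen("file.bin","wb")) != NULL){
        /* Writes val in this machine's in-memory representation. */
        fwrite(&val,sizeof(int),1,file);
        fclose(file);
    }
    return 0;
}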

I was wondering about what happens if I try to read the file resulting from this code with fread in an architecture where integers have a different size from the integers in the architecture that produced the file. I think that the result will not match the original value of the variable val in this code.

If this is true, how can we deal with this problem? How can we produce portable binary files in C?


Solution

  • I was wondering about what happens if I try to read the file in an architecture where integers have a different size from the integers in the architecture that produced the file.

    That is absolutely a good thing to worry about. The other big concern is byte order.

    When you say

    fwrite(&val, sizeof(int), 1, file);
    

    you are saying, "write this int to the file in binary, using exactly the same representation it has in memory on my machine: same size, same byte order, same everything". And, yes, that means the file format is essentially defined by "the representation it has on my machine", not in any nicely-portable way.

    But that's not the only way to write an int to a file in binary. There are lots of other ways, with varying degrees of portability. The way I like to do it is simply:

    putc((val >> 8) & 0xff, file);     /* MSB */
    putc( val       & 0xff, file);     /* LSB */
    

    For simplicity here I'm assuming that the binary format being written uses two bytes (16 bits) for the on-disk version of the integer, meaning I'm assuming that the int variable val never holds a number bigger than that.

    Written that way, the two-byte integer is written in "big endian" order, with the most-significant byte first. If you want to define your binary file format to use "little endian" order instead, the change is almost trivial:

    putc( val       & 0xff, file);     /* LSB */
    putc((val >> 8) & 0xff, file);     /* MSB */
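
    As a minimal sketch of how the two orderings above might be wrapped up for reuse, here is one possible pair of helpers. The names put_u16_be and put_u16_le are hypothetical (they are not standard library functions), and the use of uint16_t from <stdint.h> is an assumption made so that the shifts operate on an unsigned value. Each helper also checks the result of putc, which the one-liners above do not.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write v as two bytes, most-significant byte first
   (big endian). Returns 0 on success, -1 if either putc call fails. */
int put_u16_be(uint16_t v, FILE *fp)
{
    if (putc((v >> 8) & 0xff, fp) == EOF) return -1;   /* MSB */
    if (putc( v       & 0xff, fp) == EOF) return -1;   /* LSB */
    return 0;
}

/* Hypothetical helper: same value, least-significant byte first
   (little endian). Returns 0 on success, -1 if either putc call fails. */
int put_u16_le(uint16_t v, FILE *fp)
{
    if (putc( v       & 0xff, fp) == EOF) return -1;   /* LSB */
    if (putc((v >> 8) & 0xff, fp) == EOF) return -1;   /* MSB */
    return 0;
}
```

    Under these assumptions, calling put_u16_be(0x1234, file) should leave the bytes 0x12 0x34 in the file on any machine, and put_u16_le(0x1234, file) should leave 0x34 0x12, regardless of the host's own byte order or sizeof(int).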
    

    You would use some similar-looking code, involving calls to getc and some more bit-shifting operations, to read the bytes back in on the other end and recombine them into an integer. Here's the little-endian version:

    val  = getc(file);
    val |= getc(file) << 8;
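
    The two-line reader above does not notice a short or unreadable file, because getc returns EOF in that case and the EOF value would be folded into val. A slightly more defensive sketch might look like the following. The name get_u16_le is hypothetical, and uint16_t is again an assumption rather than something the format requires. Note that the two getc calls stay in separate statements: in a single expression such as getc(fp) | getc(fp) << 8, C does not specify which call runs first.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read two bytes, least-significant byte first,
   and store the combined value in *out.
   Returns 0 on success, -1 on end of file or read error. */
int get_u16_le(FILE *fp, uint16_t *out)
{
    int lo = getc(fp);              /* first byte on disk: LSB */
    if (lo == EOF) return -1;

    int hi = getc(fp);              /* second byte on disk: MSB */
    if (hi == EOF) return -1;

    /* Combine as unsigned so the shift cannot touch a sign bit. */
    *out = (uint16_t)((unsigned)lo | ((unsigned)hi << 8));
    return 0;
}
```

    A big-endian reader would be the same sketch with the roles of the first and second byte swapped.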
    

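    These examples aren't perfect: they do no error checking (getc returns EOF at end of file or on a read error), and they are guaranteed to work properly for all values only if val is an unsigned type, since right-shifting a negative signed value is implementation-defined. There are more wrinkles we might apply in order to deal with signed integers, and integers of size other than two bytes, but this should get you started.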
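
    To illustrate one way those wrinkles might be handled, here is a sketch for a signed, four-byte, little-endian field. Everything about it is an assumption for the sake of example: the names put_i32_le and get_i32_le are hypothetical, the on-disk format is taken to be 32-bit two's complement, and int32_t/uint32_t come from <stdint.h>. The idea is to do all of the shifting on an unsigned copy of the value and to convert back to signed without relying on implementation-defined conversions.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write v as four bytes, least-significant first.
   Returns 0 on success, -1 if a putc call fails. */
int put_i32_le(int32_t v, FILE *fp)
{
    /* Signed-to-unsigned conversion is well defined (modulo 2^32),
       so u holds the two's complement bit pattern of v. */
    uint32_t u = (uint32_t)v;
    for (int i = 0; i < 4; i++)
        if (putc((int)((u >> (8 * i)) & 0xff), fp) == EOF) return -1;
    return 0;
}

/* Hypothetical helper: read four bytes, least-significant first.
   Returns 0 on success, -1 on end of file or read error. */
int get_i32_le(FILE *fp, int32_t *out)
{
    uint32_t u = 0;
    for (int i = 0; i < 4; i++) {
        int c = getc(fp);
        if (c == EOF) return -1;
        u |= (uint32_t)c << (8 * i);
    }
    /* Map the bit pattern back to a signed value using only
       in-range arithmetic. */
    if (u <= INT32_MAX)
        *out = (int32_t)u;
    else
        *out = (int32_t)(u - 0x80000000u) + INT32_MIN;
    return 0;
}
```

    With these helpers, a value written by put_i32_le on one machine should read back unchanged through get_i32_le on another, as long as both sides agree on the four-byte little-endian layout.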

    See also questions 12.42 and 16.7 in the C FAQ list. See also this chapter of some on-line C Programming notes.