c struct sizeof fwrite

Write raw struct contents (bytes) to a file in C. Confused about actual size written


Basic question, but I expected this struct to occupy 13 bytes of space (1 for the char, 12 for the 3 unsigned ints). Instead, sizeof(ESPR_REL_HEADER) gives me 16 bytes.

typedef struct {
  unsigned char version;
  unsigned int  root_node_num;
  unsigned int  node_size;
  unsigned int  node_count;
} ESPR_REL_HEADER;

What I'm trying to do is initialize this struct with some values and write the data it contains (the raw bytes) to the start of a file, so that when I open this file later I can reconstruct this struct and get some metadata about what the rest of the file contains.

I'm initializing the struct and writing it to the file like this:

int esprime_write_btree_header(FILE * fp, unsigned int node_size) {
  ESPR_REL_HEADER header = {
    .version       = 1,
    .root_node_num = 0,
    .node_size     = node_size,
    .node_count    = 1
  };

  return fwrite(&header, sizeof(ESPR_REL_HEADER), 1, fp);
}

Where node_size is currently 4 while I experiment.

The file contains the following data after I write the struct to it:

-bash$  hexdump test.dat
0000000 01 bf f9 8b 00 00 00 00 04 00 00 00 01 00 00 00
0000010

I expect it to actually contain:

-bash$  hexdump test.dat
0000000 01 00 00 00 00 04 00 00 00 01 00 00 00
000000d

Excuse the newbiness. I am trying to learn :) How do I efficiently write just the data components of my struct to a file?


Solution

  • Most processors are not designed to fetch multi-byte data efficiently from arbitrary addresses, and some cannot do it at all. Objects such as 4-byte ints should therefore be stored only at addresses divisible by four. This requirement is called alignment.

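    C gives the compiler freedom to insert padding bytes between struct members to align them. In your case, three padding bytes follow version so that root_node_num starts at offset 4, which is why sizeof reports 16 and why the hexdump shows three indeterminate bytes (bf f9 8b) after the 01. The amount of padding is just one thing that varies between platforms; another major one is endianness. This is why you should not simply "dump" structures to disk if you want the program to run on more than one machine.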

    The best practice is to write each member explicitly, and to use htonl to convert each 4-byte value to big-endian before binary output. When reading back, use memcpy to move the raw bytes. Do not use

    char *buffer_ptr;
    ...
    ++ buffer_ptr;
    header.root_node_num = * (unsigned int *) buffer_ptr; /* potential alignment error */
    

    but instead do

    /* copy from the file buffer into the member, then fix the byte order */
    memcpy( & header.root_node_num, buffer_ptr, sizeof header.root_node_num );
    header.root_node_num = ntohl( header.root_node_num ); /* if member is 4 bytes */
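
To make the advice above concrete, here is a minimal sketch of writing and reading the header one member at a time. It assumes a POSIX system (htonl and ntohl come from <arpa/inet.h>) and that unsigned int is 32 bits wide. The names ESPR_HEADER_DISK_SIZE, put_u32, get_u32 and the espr_header_* functions are made up for this illustration and are not part of the question's code.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX; assumed available) */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
  unsigned char version;
  unsigned int  root_node_num;
  unsigned int  node_size;
  unsigned int  node_count;
} ESPR_REL_HEADER;

/* On-disk size: 1 byte plus three 4-byte fields, no padding. */
enum { ESPR_HEADER_DISK_SIZE = 1 + 3 * 4 };

/* Hypothetical helper: store one 32-bit value in big-endian order. */
static unsigned char *put_u32(unsigned char *p, unsigned int value) {
  uint32_t be = htonl((uint32_t) value);
  memcpy(p, &be, sizeof be);  /* memcpy has no alignment requirement */
  return p + sizeof be;
}

/* Hypothetical helper: load one big-endian 32-bit value. */
static const unsigned char *get_u32(const unsigned char *p, unsigned int *value) {
  uint32_t be;
  memcpy(&be, p, sizeof be);
  *value = ntohl(be);
  return p + sizeof be;
}

/* Pack the header member by member; buf must hold ESPR_HEADER_DISK_SIZE bytes. */
static void espr_header_encode(const ESPR_REL_HEADER *h, unsigned char *buf) {
  unsigned char *p = buf;
  assert(h != NULL && buf != NULL);
  *p++ = h->version;
  p = put_u32(p, h->root_node_num);
  p = put_u32(p, h->node_size);
  (void) put_u32(p, h->node_count);
}

/* Unpack a header from ESPR_HEADER_DISK_SIZE bytes. */
static void espr_header_decode(const unsigned char *buf, ESPR_REL_HEADER *h) {
  const unsigned char *p = buf;
  assert(h != NULL && buf != NULL);
  h->version = *p++;
  p = get_u32(p, &h->root_node_num);
  p = get_u32(p, &h->node_size);
  (void) get_u32(p, &h->node_count);
}

/* Returns 1 on success, 0 on failure. */
static int espr_header_write(FILE *fp, const ESPR_REL_HEADER *h) {
  unsigned char buf[ESPR_HEADER_DISK_SIZE];
  espr_header_encode(h, buf);
  return fwrite(buf, sizeof buf, 1, fp) == 1;
}

/* Returns 1 on success, 0 on a short read or error. */
static int espr_header_read(FILE *fp, ESPR_REL_HEADER *h) {
  unsigned char buf[ESPR_HEADER_DISK_SIZE];
  if (fread(buf, sizeof buf, 1, fp) != 1) return 0;
  espr_header_decode(buf, h);
  return 1;
}
```

With this approach the file holds exactly 13 bytes regardless of what sizeof(ESPR_REL_HEADER) is on a given compiler. Note that the integers land in big-endian order, so a node_size of 4 appears as 00 00 00 04 rather than the 04 00 00 00 shown in the question's expected dump. If you prefer little-endian on disk, you would need your own byte-shuffling in place of htonl and ntohl.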