c++ file-io binaryfiles ofstream random-access

Appending to binary file after writing to beginning


I'm creating a binary file of records with the following format:

quantity-of-records  
record_1  
record_2  
...  
record_N  

The issue is that record_1 is overwritten on each iteration instead of the new record being appended.

Writing at EOF after writing at BOF

Here's my simplified code:

#include <fstream>
#include <string>

struct Record
{
    unsigned int    id;
    std::string     text;
};


int main()
{
    static const Record     table[] =
    {
        {
            1, "Apple"
        },
        {
            2, "Salt"
        },
        {
            3, "Margarine"
        },
        {
            4, "Carrot"
        },
        {
            5, "Plum"
        }
    };

    static const size_t records_in_table =
        sizeof(table) / sizeof(table[0]);

    static const char   table_filename[] = "record_file.bin";

    size_t i;
    size_t record_quantity = 1u;
    for (i = 0u; i < records_in_table; ++i)
    {
        std::ofstream   table_file(table_filename,
                                   std::ios::binary);
        table_file.seekp(0, std::ios::beg);
        table_file.write(reinterpret_cast<char *>(&record_quantity),
                         sizeof(record_quantity));
        table_file.flush();

        table_file.seekp(0, std::ios::end);
        table_file.write(reinterpret_cast<char const *>(&table[i].id),
                         sizeof(Record::id));
        const size_t length(table[i].text.length());
        table_file.write(reinterpret_cast<char const *>(&length),
                         sizeof(length));
        table_file.write(table[i].text.c_str(),
                         length);
        table_file.close();
        ++record_quantity;
    }
    return 0;
}

Here's the content of the binary file:

$ od -Ax -x record_file.bin
000000 0005 0000 0000 0000 0005 0000 0004 0000
000010 0000 0000 6c50 6d75
000018

Numbers are written in little-endian format, as either 32-bit (4-byte) or 64-bit (8-byte) values. The text "Plum" is ASCII encoded as 0x50, 0x6C, 0x75, 0x6D.

Here's the binary file after the first iteration:

$ od -Ax -x record_file.bin
000000 0001 0000 0000 0000 0001 0000 0005 0000
000010 0000 0000 7041 6c70 0065
000019

Environment/Tools:

  • Compilers: Visual Studio 2017, G++ (GCC) 7.4.0 (Cygwin)
  • OS: Windows 7

Opening with mode app

An alternative is to open the file in ios::app mode, write the new record, then update the quantity-of-records:

size_t  i;
size_t  record_quantity = 1u;
bool    first_write(true);
for (i = 0u; i < records_in_table; ++i)
{
    std::ofstream   table_file(table_filename,
                               std::ios::binary | std::ios::app);
    if (first_write)
    {
        first_write = false;
        table_file.write(reinterpret_cast<char *>(&record_quantity),
                         sizeof(record_quantity));
        table_file.flush();
        table_file.write(reinterpret_cast<char const *>(&table[i].id),
                         sizeof(Record::id));
        const size_t length(table[i].text.length());
        table_file.write(reinterpret_cast<char const *>(&length),
                         sizeof(length));
        table_file.write(table[i].text.c_str(),
                         length);
    }
    else
    {
        table_file.write(reinterpret_cast<char const *>(&table[i].id),
                         sizeof(Record::id));
        const size_t length(table[i].text.length());
        table_file.write(reinterpret_cast<char const *>(&length),
                         sizeof(length));
        table_file.write(table[i].text.c_str(),
                         length);
        table_file.flush();
        table_file.seekp(0, std::ios::beg);
        table_file.write(reinterpret_cast<char *>(&record_quantity),
                         sizeof(record_quantity));
    }
    table_file.close();
    ++record_quantity;
}

However, with the alternative implementation, the quantity-of-records (the first integer in the file) is not updated.
Here is the content of the binary file:

$ od -Ax -x record_file.bin
000000 0001 0000 0000 0000 0001 0000 0005 0000
000010 0000 0000 7041 6c70 0165 0000 0000 0000
000020 0100 0000 0500 0000 0000 0000 4100 7070
000030 656c 0002 0000 0004 0000 0000 0000 6153
000040 746c 0002 0000 0000 0000 0003 0000 0009
000050 0000 0000 0000 614d 6772 7261 6e69 0365
000060 0000 0000 0000 0400 0000 0600 0000 0000
000070 0000 4300 7261 6f72 0474 0000 0000 0000
000080 0500 0000 0400 0000 0000 0000 5000 756c
000090 056d 0000 0000 0000 0000
000099

Question: How can I append a record to the end of the file and update the first integer (at the beginning of the file)?


Solution

  • Root Cause

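    The root cause is the mode the file is opened with. My experiments show that data is appended only when the file is opened with std::ios_base::app, and that in this mode every write lands at EOF: seeking to another position and then writing still appends the data, which is consistent with documentation stating that all writes are appended. Without app, a std::ofstream opened with only std::ios_base::out truncates the file on every open, which is why the first program keeps only the latest record.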

    To write at the beginning of the file without truncating it, the ofstream must be opened with std::ios_base::in in addition to std::ios_base::out (std::ofstream always adds out itself).
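The three open modes discussed above can be compared with a small experiment. This is only a sketch: the helper names `slurp`, `write_at_start` and `demo`, and the file name used when calling them, are made up for illustration and are not part of the original program.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Read the whole file into a string so the effect of each mode can be inspected.
std::string slurp(const char* path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: open `path` with `mode`, seek to offset 0, write `text`.
void write_at_start(const char* path,
                    std::ios_base::openmode mode,
                    const std::string& text)
{
    std::ofstream out(path, mode);          // ofstream always adds ios_base::out
    out.seekp(0, std::ios::beg);
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
}

// Start from a file containing "ABCDEF", write "xy" at offset 0 using `mode`,
// and return the resulting file contents.
std::string demo(const char* path, std::ios_base::openmode mode)
{
    {
        std::ofstream init(path, std::ios::binary | std::ios::trunc);
        init << "ABCDEF";
    }
    write_at_start(path, mode, "xy");
    return slurp(path);
}
```

Calling `demo` with `std::ios::binary` alone should leave `xy` (the file is truncated), with `std::ios::binary | std::ios::app` it should leave `ABCDEFxy` (the seek is ignored), and with `std::ios::binary | std::ios::in` it should leave `xyCDEF` (an in-place overwrite).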

    Corrected Program

    I have modified my program so that records are aligned on 16-byte boundaries and unused bytes are filled with 0xFF (this makes the hex dump easier to read). All integer data is 32-bit; text is variable length.

    The record data is written first, so that it is appended to the file. The file is opened twice per iteration, using two different stream variables: once in append mode for the record and once with the "in" attribute for the quantity.

    #include <algorithm>    // std::fill
    #include <cstdint>      // uint8_t
    #include <cstdio>       // std::remove
    #include <fstream>
    #include <string>
    
    struct Table_Quantity_Record
    {
        unsigned int    quantity;
        uint8_t         padding[12];
    };
    
    struct Record
    {
        unsigned int    id;
        std::string     text;
    };
    
    
    int main()
    {
        static const Record     table[] =
        {
            { 0x11111111, "Apple"},
            { 0x22222222, "Salt"},
            { 0x33333333, "Butter"},
            { 0x44444444, "Carrot"},
            { 0x55555555, "Plum"},
        };
    
        static const size_t records_in_table =
            sizeof(table) / sizeof(table[0]);
    
        static const char   table_filename[] = "record_file.bin";
    
        std::remove(&table_filename[0]);
    
        size_t  i;
        Table_Quantity_Record   quantity_record;
        quantity_record.quantity = 1;
        std::fill(&quantity_record.padding[0],
                  &quantity_record.padding[12],
                  0xffu);
        for (i = 0; i < records_in_table; ++i)
        {
            // Open the file in append mode, and append the new data record.
            std::ofstream   data_file(&table_filename[0],
                                      std::ios_base::binary | std::ios_base::app | std::ios_base::ate);
            if (data_file)
            {
                data_file.write((char *) &table[i].id, sizeof(Record::id));
                const unsigned int length = table[i].text.length();
                data_file.write((char *) &length, sizeof(length));
                data_file.write(table[i].text.c_str(), length);
                data_file.flush();
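                // Pad to the next 16-byte boundary; the modulo keeps the count
                // valid even when the text is longer than 8 bytes.
                const unsigned int used =
                    sizeof(Record::id) + sizeof(length) + length;
                const unsigned int padding_qty = (16 - used % 16) % 16;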
                static const uint8_t pad_byte = 0xFFU;
                for (size_t j = 0; j < padding_qty; ++j)
                {
                    data_file.write((char *) &pad_byte, sizeof(pad_byte));
                }
                data_file.flush();
                data_file.close();
            }
    
            // Open the data file with "in" attribute to write the record quantity
            // at the beginning of the file.
            std::ofstream   table_file(&table_filename[0],
                                       std::ios_base::binary | std::ios_base::in);
            table_file.write((char *) &quantity_record, sizeof(quantity_record));
            table_file.flush();
            table_file.close();
            ++quantity_record.quantity;
        }
        return 0;
    }
    

    Content of Binary File

    $ od -Ax -x record_file.bin
    000000 0005 0000 ffff ffff ffff ffff ffff ffff
    000010 2222 2222 0004 0000 6153 746c ffff ffff
    000020 3333 3333 0006 0000 7542 7474 7265 ffff
    000030 4444 4444 0006 0000 6143 7272 746f ffff
    000040 5555 5555 0004 0000 6c50 6d75 ffff ffff
    000050
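To check a file with this layout, the records can be read back. The following is a minimal sketch that assumes the layout shown in the dump (a 16-byte header whose first 4 bytes hold the quantity, then records of a 32-bit id, a 32-bit length, the text, and 0xFF padding up to the next 16-byte boundary) and a reader with the same byte order as the writer. `LoadedRecord` and `read_table` are hypothetical names, not part of the original program.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

struct LoadedRecord            // hypothetical in-memory form of one record
{
    std::uint32_t id;
    std::string   text;
};

// Read the quantity from the header and every record that follows it.
// Assumes 32-bit fields and 16-byte alignment, as in the dump above.
std::vector<LoadedRecord> read_table(const char* path, std::uint32_t& quantity)
{
    std::vector<LoadedRecord> records;
    quantity = 0;

    std::ifstream in(path, std::ios::binary);
    if (!in.read(reinterpret_cast<char*>(&quantity), sizeof(quantity)))
        return records;
    in.seekg(16, std::ios::beg);            // skip the header padding

    for (;;)
    {
        std::uint32_t id = 0;
        std::uint32_t length = 0;
        if (!in.read(reinterpret_cast<char*>(&id), sizeof(id))) break;
        if (!in.read(reinterpret_cast<char*>(&length), sizeof(length))) break;

        std::string text(length, '\0');
        if (length > 0 && !in.read(&text[0], length)) break;
        records.push_back({id, text});

        // Skip the 0xFF padding up to the next 16-byte boundary.
        const std::streamoff used = 8 + static_cast<std::streamoff>(length);
        in.seekg((16 - used % 16) % 16, std::ios::cur);
    }
    return records;
}
```

Comparing the returned vector's size with `quantity` is a quick way to confirm that the header and the appended records agree.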
    


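    Note: the record ID values differ from those in the question, to make the records easier to locate in the dump. The dump also shows that the first record (0x11111111, "Apple") is missing: it was appended to an empty file at offset 0, so the 16-byte quantity record written at the beginning of the file lands on top of it. Writing the quantity record before appending the first record avoids this.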
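In the program above, the quantity record and the first appended record both start at offset 0. One possible variation reserves the 16-byte header when the file is created and uses a single std::fstream opened with in | out for both the append and the header update. This is a sketch under the same layout assumptions (32-bit fields, 16-byte alignment, 0xFF padding); `append_record`, `write_header`, `write_padding` and `kRecordAlign` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>

static const std::size_t kRecordAlign = 16;   // 16-byte slots, as in the answer

// Write `count` bytes of 0xFF padding at the current put position.
static void write_padding(std::fstream& file, std::size_t count)
{
    const std::string pad(count, '\xFF');
    file.write(pad.data(), static_cast<std::streamsize>(pad.size()));
}

// Overwrite the 16-byte header at offset 0 with the given record count.
static void write_header(std::fstream& file, std::uint32_t quantity)
{
    file.seekp(0, std::ios::beg);
    file.write(reinterpret_cast<const char*>(&quantity), sizeof(quantity));
    write_padding(file, kRecordAlign - sizeof(quantity));
}

// Append one record, then update the quantity stored at the start of the file.
bool append_record(const char* path, std::uint32_t id, const std::string& text)
{
    std::uint32_t quantity = 0;

    // in | out keeps the existing contents, but fails if the file is missing.
    std::fstream file(path, std::ios::binary | std::ios::in | std::ios::out);
    if (file)
    {
        file.read(reinterpret_cast<char*>(&quantity), sizeof(quantity));
    }
    else
    {
        // Create the file and reserve the header before any record is written.
        file.clear();
        file.open(path, std::ios::binary | std::ios::in | std::ios::out |
                        std::ios::trunc);
        if (file) write_header(file, 0);
    }
    if (!file) return false;

    // Append the record at EOF, padded to the next 16-byte boundary.
    file.seekp(0, std::ios::end);
    const std::uint32_t length = static_cast<std::uint32_t>(text.size());
    file.write(reinterpret_cast<const char*>(&id), sizeof(id));
    file.write(reinterpret_cast<const char*>(&length), sizeof(length));
    file.write(text.data(), static_cast<std::streamsize>(length));
    const std::size_t used = sizeof(id) + sizeof(length) + length;
    write_padding(file, (kRecordAlign - used % kRecordAlign) % kRecordAlign);

    // Update the count in place at the beginning of the file.
    write_header(file, quantity + 1);
    return static_cast<bool>(file);
}
```

Calling `append_record` once per table entry, after removing any old file, should produce a header followed by every record, including the first. An existing but empty file is treated as an error here, since no header can be read from it.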