c# data-structures format binaryreader

Gathering binary data from a file in C# and editing or adding data without writing the whole file again


I know this sounds kind of confusing, but I was wondering if there is a way to edit a binary file while maintaining its structure, whether that means adding data somewhere in the file or changing a value at a certain position.

What I do right now to edit binary files is write a parser with the BinaryReader class (in C#), reading a certain structure with reader.ReadSingle, reader.ReadInt32, and so on.

Then I write the exact same thing with BinaryWriter, which seems kind of inefficient, and I could easily make a mistake that leaves the reader and the writer out of sync, making the format inconsistent.

Is there any way to define the file structure once and have both reading and writing handled automatically from that single format definition? Or to open a file, edit some of its values (or add some, since it's not a fixed format and reading it involves some for loops, for example), and save those changes?

I hope I explained myself in a slightly understandable way.


Solution

  • If you want to insert new data into a binary file, you have three options:

    1. Move everything from that point forward down a bit so that you make space for the new data.
    2. Somehow mark the existing data as no longer relevant (e.g. with a deleted flag), and add the new data at the end of the file.
    3. Replace the existing data with a pointer to another location in the file (typically the end of the file) where the new data is stored.

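    The first method requires rewriting everything from the insertion point to the end of the file, which in the worst case (inserting near the start) is the entire file.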
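A minimal sketch of the first option, assuming the file fits the usual FileStream random-access model. The class name FileInsert and the method InsertBytes are hypothetical names chosen for illustration, not part of any library. The tail of the file is copied back to front in chunks so that no byte is overwritten before it has been read.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class FileInsert
{
    // Inserts `data` at byte offset `position`, shifting everything
    // from that offset onward toward the end of the file.
    public static void InsertBytes(string path, long position, byte[] data)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            if (position < 0 || position > fs.Length)
                throw new ArgumentOutOfRangeException(nameof(position));

            long oldLength = fs.Length;
            fs.SetLength(oldLength + data.Length); // grow the file first

            // Move the tail in chunks, starting with the last chunk.
            byte[] buffer = new byte[64 * 1024];
            long remaining = oldLength - position;
            while (remaining > 0)
            {
                int chunk = (int)Math.Min(buffer.Length, remaining);
                long src = position + remaining - chunk;

                fs.Position = src;
                ReadFully(fs, buffer, chunk);

                fs.Position = src + data.Length;
                fs.Write(buffer, 0, chunk);

                remaining -= chunk;
            }

            // The gap at `position` is now free for the new data.
            fs.Position = position;
            fs.Write(data, 0, data.Length);
        }
    }

    // Stream.Read may return fewer bytes than requested, so loop until done.
    private static void ReadFully(Stream s, byte[] buffer, int count)
    {
        int read = 0;
        while (read < count)
        {
            int n = s.Read(buffer, read, count - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
    }
}
```

For example, calling FileInsert.InsertBytes(path, 2, new byte[] { 9, 9 }) on a file containing the bytes 1 2 3 4 5 leaves 1 2 9 9 3 4 5. The cost grows with the distance from the insertion point to the end of the file.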

    The second method can work well if it's a file of records, for example, and if you don't depend on the order of records in the file. It becomes more difficult if the file has a complex structure of nested records, etc. It has the drawback of leaving a lot of empty space in the file.
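A rough sketch of the second option, under the simplifying assumption that every record has the same size: a one-byte live/deleted flag followed by an int id and a float value. That layout, the class name RecordFile, and its methods are all hypothetical and chosen only to keep the example short; a variable-length format would also need a length prefix per record.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical fixed-size record layout: [flag:1][id:4][value:4]
public static class RecordFile
{
    private const int RecordSize = 1 + 4 + 4;
    private const byte Deleted = 0;
    private const byte Live = 1;

    // Adds a live record at the end of the file (creates the file if needed).
    public static void Append(string path, int id, float value)
    {
        using (var fs = new FileStream(path, FileMode.Append, FileAccess.Write))
        using (var w = new BinaryWriter(fs))
        {
            w.Write(Live);
            w.Write(id);
            w.Write(value);
        }
    }

    // "Edits" a record: flips one byte in place, then appends the new version.
    public static void Update(string path, long recordIndex, int id, float newValue)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            long offset = recordIndex * RecordSize;
            if (recordIndex < 0 || offset + RecordSize > fs.Length)
                throw new ArgumentOutOfRangeException(nameof(recordIndex));
            fs.Position = offset;
            fs.WriteByte(Deleted); // the old bytes stay in the file as dead space
        }
        Append(path, id, newValue);
    }

    // Reads every record sequentially and skips the ones flagged as deleted.
    public static List<(int Id, float Value)> ReadLive(string path)
    {
        var result = new List<(int Id, float Value)>();
        using (var fs = File.OpenRead(path))
        using (var r = new BinaryReader(fs))
        {
            while (fs.Position < fs.Length)
            {
                byte flag = r.ReadByte();
                int id = r.ReadInt32();
                float value = r.ReadSingle();
                if (flag == Live) result.Add((id, value));
            }
        }
        return result;
    }
}
```

Note that after an update the replacement record sits at the end of the file, so the order of records changes, and the file only ever grows until something compacts it by rewriting it without the deleted records.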

    The third method is similar to the second, but works well if you're using random access rather than sequential access. It still ends up wasting space in the file.
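One way the third option could look, sketched under assumed conventions: the file starts with a fixed table of slots, each holding a long offset and an int length that point at a payload stored further along in the file. The slot table, the class name SlotFile, and its methods are hypothetical and not taken from any existing format. Replacing a payload appends the new bytes at the end and overwrites only the 12-byte slot in place.

```csharp
using System;
using System.IO;

// Hypothetical layout: [slot table: slotCount * (offset:8, length:4)][payloads...]
public static class SlotFile
{
    private const int SlotSize = 8 + 4;

    // Creates a file whose slot table is all zeros (offset 0 means "empty").
    public static void Create(string path, int slotCount)
    {
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            byte[] table = new byte[slotCount * SlotSize];
            fs.Write(table, 0, table.Length);
        }
    }

    // Appends the payload at the end of the file and repoints the slot at it.
    // Any payload the slot pointed to before is left behind as wasted space.
    public static void Put(string path, int slot, byte[] payload)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        using (var w = new BinaryWriter(fs))
        {
            long offset = fs.Length;
            fs.Position = offset;
            w.Write(payload);

            fs.Position = (long)slot * SlotSize;
            w.Write(offset);
            w.Write(payload.Length);
        }
    }

    // Random access: read the slot, then jump straight to the payload.
    public static byte[] Get(string path, int slot)
    {
        using (var fs = File.OpenRead(path))
        using (var r = new BinaryReader(fs))
        {
            fs.Position = (long)slot * SlotSize;
            long offset = r.ReadInt64();
            int length = r.ReadInt32();
            if (offset == 0)
                throw new InvalidOperationException("Slot is empty.");

            fs.Position = offset;
            return r.ReadBytes(length);
        }
    }
}
```

With a two-slot table, calling Put on slot 0 with 3 bytes and then again with 4 bytes leaves a 31-byte file (24 bytes of table plus both payloads), of which the first 3-byte payload is no longer reachable. Readers never scan the file; they follow the slot, which is why this suits random access.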