c# save corrupt-data

How to ensure that data doesn't get corrupted when saving to file?


I am relatively new to C# so please bear with me.

I am writing a business application (in C#, .NET 4) that needs to be reliable. Data will be stored in files. Files will be modified (rewritten) regularly, thus I am afraid that something could go wrong (power loss, application gets killed, system freezes, ...) while saving data which would (I think) result in a corrupted file. I know that data which wasn't saved is lost, but I must not lose data which was already saved (because of corruption or ...).

My idea is to keep 2 versions of every file and to overwrite the older one on each save. Then, if my application ends unexpectedly, at least one of the two files should still be valid.

Is this a good approach? Is there anything else I could do? (Database is not an option)

Thank you for your time and answers.


Solution

  • A lot of programs use this approach, but they usually keep more than two copies, which also protects against human error.

    For example, CadSoft EAGLE (a program used to design circuits and printed circuit boards) keeps up to 9 backup copies of the same file, naming them file.b#1 ... file.b#9
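
    A numbered backup scheme like this could be sketched as follows. This is only an illustration: the BackupRotation class, the Rotate method and the choice to append ".b#N" to the full file name are assumptions made for the example, not how EAGLE itself is implemented.

    ```csharp
    using System.IO;

    static class BackupRotation
    {
        // Hypothetical helper: keeps up to maxBackups older copies named
        // path.b#1 .. path.b#N, where .b#1 is the most recent backup.
        public static void Rotate(string path, int maxBackups)
        {
            if (maxBackups < 1 || !File.Exists(path)) return;

            // Drop the oldest backup so the shift below never collides.
            string oldest = path + ".b#" + maxBackups;
            if (File.Exists(oldest)) File.Delete(oldest);

            // Shift .b#N to .b#(N+1), starting from the oldest remaining copy.
            for (int i = maxBackups - 1; i >= 1; i--)
            {
                string from = path + ".b#" + i;
                if (File.Exists(from)) File.Move(from, path + ".b#" + (i + 1));
            }

            // Copy (not move) so the current file stays in place
            // until the next save has succeeded.
            File.Copy(path, path + ".b#1");
        }
    }
    ```

    You would call BackupRotation.Rotate(path, 9) right before each save, so the previous contents are preserved before the file is rewritten.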

    Another thing you can do to improve reliability is hashing: append a checksum such as a CRC32 or an MD5 hash to the end of the file. When you open the file, you recompute the checksum, and if it doesn't match the stored one, the file is corrupted. This also detects changes made to your file by another program, at least accidental ones (someone who knows the format can recompute a plain hash after modifying the file on purpose). It also gives you a way to notice when the hard drive or USB disk itself has corrupted the data.
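
    As a rough sketch of the checksum idea, the following appends an MD5 digest to the data and verifies it on load. The ChecksummedFile class and its file layout (payload followed by 16 hash bytes) are assumptions made for this example; MD5 is used only because it is mentioned above and ships with .NET.

    ```csharp
    using System;
    using System.IO;
    using System.Linq;
    using System.Security.Cryptography;

    static class ChecksummedFile
    {
        const int HashSize = 16; // an MD5 digest is 16 bytes long

        // Writes the payload followed by its MD5 digest.
        public static void Write(string path, byte[] payload)
        {
            using (var md5 = MD5.Create())
            using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write))
            {
                byte[] hash = md5.ComputeHash(payload);
                fs.Write(payload, 0, payload.Length);
                fs.Write(hash, 0, hash.Length);
            }
        }

        // Returns the payload, or throws InvalidDataException
        // if the stored digest does not match the data.
        public static byte[] Read(string path)
        {
            byte[] all = File.ReadAllBytes(path);
            if (all.Length < HashSize)
                throw new InvalidDataException("File is too short to contain a checksum.");

            int payloadLength = all.Length - HashSize;
            byte[] payload = new byte[payloadLength];
            byte[] stored = new byte[HashSize];
            Buffer.BlockCopy(all, 0, payload, 0, payloadLength);
            Buffer.BlockCopy(all, payloadLength, stored, 0, HashSize);

            using (var md5 = MD5.Create())
            {
                if (!md5.ComputeHash(payload).SequenceEqual(stored))
                    throw new InvalidDataException("Checksum mismatch: the file is corrupted.");
            }
            return payload;
        }
    }
    ```

    If Read throws, the application can fall back to the other copy of the file instead of loading damaged data.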

    Of course, the faster the save operation is, the lower the risk of losing data, but you can never be sure that nothing will happen during or after the write.

    Keep in mind that hard drives, USB drives and Windows itself all use caches. This means that even after your write has finished, the OS or the disk itself may not yet have physically written the data to the disk.
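
    In .NET 4 you can at least ask the OS to empty its own file buffers before you treat a save as complete. The sketch below uses FileStream.Flush(true) for that; the DurableWrite class name is made up for the example, and the drive's internal cache may still hold the data afterwards.

    ```csharp
    using System.IO;
    using System.Text;

    static class DurableWrite
    {
        // Writes text to path and asks the OS to push it to the device.
        public static void WriteAndFlush(string path, string text)
        {
            byte[] bytes = Encoding.UTF8.GetBytes(text);
            using (var fs = new FileStream(path, FileMode.Create,
                                           FileAccess.Write, FileShare.None))
            {
                fs.Write(bytes, 0, bytes.Length);
                // Flush(true) clears the stream's buffer and also requests
                // that the OS write its intermediate file buffers to disk.
                fs.Flush(true);
            }
        }
    }
    ```

    Opening the stream with FileOptions.WriteThrough is another option to look into; either way this narrows the window for data loss rather than closing it.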

    Another thing you can do is save to a temporary file first and, only if everything went well, move that file to the real destination. This reduces the risk of ending up with half-written files.
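
    A minimal sketch of the temporary-file approach, assuming the temporary file may live next to the destination (the SafeSave class and the ".tmp" and ".bak" suffixes are choices made for this example):

    ```csharp
    using System.IO;

    static class SafeSave
    {
        public static void Save(string path, byte[] data)
        {
            // Keep the temp file in the same directory so the final
            // step stays on one volume.
            string temp = path + ".tmp";
            using (var fs = new FileStream(temp, FileMode.Create,
                                           FileAccess.Write, FileShare.None))
            {
                fs.Write(data, 0, data.Length);
                fs.Flush(true); // ask the OS to write its buffers to disk
            }

            if (File.Exists(path))
                // Swaps the new file in and keeps the previous version as .bak.
                File.Replace(temp, path, path + ".bak");
            else
                // File.Replace needs an existing destination, so the first save moves.
                File.Move(temp, path);
        }
    }
    ```

    If the application dies while the temporary file is being written, the real file has not been touched yet, and the leftover .tmp file can simply be discarded on the next start.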

    You can mix all these techniques together.