I have lots of data which I would like to save to disk in binary form and I would like to get as close to having ACID properties as possible. Since I have lots of data and cannot keep it all in memory, I understand I have two basic approaches:
So my question is specifically:
If I choose to go for the large file option and open it as a memory mapped file (or write to it using `Stream.Position` and `Stream.Write`), and there is a loss of power, are there any guarantees about what could happen to the file?
Is it possible to lose the entire large file, or just end up with the data corrupted in the middle?
Does NTFS ensure that a block of a certain size (4 KB?) always gets written entirely?
Is the outcome better/worse on Unix/ext4?
I would like to avoid using NTFS TxF, since Microsoft has already said it plans to retire it. I am using C#, but the language probably doesn't matter.
(additional clarification)
It seems that there should be some guarantee, because -- unless I am wrong -- if it were possible to lose the entire file (or suffer really weird corruption) while writing to it, then no existing database would be ACID, unless they 1) use TxF or 2) make a copy of the entire file before writing. I don't think a journal helps if you can lose parts of the file you never even planned to touch.
You can call `FlushViewOfFile`, which initiates the dirty page writes, and then `FlushFileBuffers`, which, according to this article, guarantees that the pages have been written.
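In C#, those two calls are reachable without P/Invoke: `MemoryMappedViewAccessor.Flush` wraps `FlushViewOfFile`, and `FileStream.Flush(true)` wraps `FlushFileBuffers`. A minimal sketch of that sequence (the file name and the value written are made up for illustration):

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

class FlushDemo
{
    static void Main()
    {
        using var stream = new FileStream("data.bin", FileMode.Create,
                                          FileAccess.ReadWrite, FileShare.None);
        stream.SetLength(4096);

        using var mmf = MemoryMappedFile.CreateFromFile(stream, null, 0,
            MemoryMappedFileAccess.ReadWrite, HandleInheritability.None,
            leaveOpen: true);
        using var view = mmf.CreateViewAccessor();

        view.Write(0, 42L);  // modify the mapped region

        view.Flush();        // FlushViewOfFile: start writing the dirty pages
        stream.Flush(true);  // FlushFileBuffers: don't return until they're on disk
    }
}
```

The order matters: flushing the view queues the dirty pages, and flushing the underlying stream with `flushToDisk: true` is what actually waits for the device.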
Calling `FlushFileBuffers` after each write might be "safer", but it's not recommended. You have to know how much loss you can tolerate. There are patterns that limit that potential loss, and even the best databases can suffer a write failure. You just have to come back to life with the least possible loss, which typically demands some logging with a multi-phase commit.
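To make the "logging with a multi-phase commit" idea concrete, here is a stripped-down sketch of the journal-first pattern. The file names and the single-record layout are invented for illustration; a real implementation would also checksum the journal record and replay it on startup if it still exists:

```csharp
using System;
using System.IO;

class JournalSketch
{
    static void Main()
    {
        byte[] record = BitConverter.GetBytes(12345L);

        // Phase 1: append the intended change to the journal and force it to disk.
        using (var log = new FileStream("journal.bin", FileMode.Create))
        {
            log.Write(record, 0, record.Length);
            log.Flush(true);  // the journal is durable before the data file changes
        }

        // Phase 2: apply the change to the data file and force it to disk.
        using (var data = new FileStream("main.bin", FileMode.Create))
        {
            data.Write(record, 0, record.Length);
            data.Flush(true);
        }

        // Phase 3: the change is committed; the journal can be discarded.
        File.Delete("journal.bin");
    }
}
```

If power is lost during phase 2, the surviving journal lets you redo the write; if it is lost during phase 1, the incomplete journal record is simply discarded, which is why a partially lost file elsewhere on disk is still the failure mode you cannot recover from without backups.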
I suppose it's possible to open the memory mapped file with `FILE_FLAG_NO_BUFFERING` and `FILE_FLAG_WRITE_THROUGH`, but that's going to hurt your throughput. I don't do this. I open the memory mapped files for asynchronous I/O, letting the OS optimize the throughput with its own implementation of async I/O completion ports. It's the fastest possible throughput. I can tolerate potential loss, and have mitigated appropriately. My memory mapped data is file backup data... and if I detect loss, I can re-backup the lost data once the hardware error is cleared.
Obviously, the file system has to be reliable enough to run a database application, but I don't know of any vendors that suggest you don't still need backups. Bad things will happen. Plan for loss. One thing I do is never write into the middle of data. My data is immutable and versioned, each "data" file is limited to 2 GB, but each application employs different strategies.
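The "never write into the middle" point can be sketched in a few lines: with an append-only file, a torn write during a power loss can only damage the tail, so everything already on disk stays valid. The file name and payload here are hypothetical:

```csharp
using System;
using System.IO;

class AppendOnlySketch
{
    static void Main()
    {
        // FileMode.Append guarantees every write lands at the end of the file,
        // so existing records are never overwritten.
        using var stream = new FileStream("versions.bin", FileMode.Append);
        byte[] payload = BitConverter.GetBytes(DateTime.UtcNow.Ticks);
        stream.Write(payload, 0, payload.Length);
        stream.Flush(true);  // force the new record to disk
    }
}
```

On recovery you scan forward from the start and stop at the first incomplete or invalid record; nothing before it needs repair.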