For my application, I am attempting to determine whether a data backup system missed any writes. I do this by writing an incrementing integer counter to a 1 GB virtual disk; afterwards I look at the reverted snapshot and check for gaps (e.g. if I see 1, 2, 3, 0, 0, 6, 7, I know the backup did not capture writes 4 and 5). This is all on a CentOS 7 VM, with mostly Python 2.7 scripts for the writes and reads (speed isn't a huge concern).
A big part of my problem has been caching: since I'm simulating random I/O, writes are often flushed from caches and reach the disk out of order. This makes every test report a false positive, since some data looks missing at the time of the snapshot. Again, I don't care about efficiency at all, so I don't mind really slow writes. Reads can use caching; that's not a problem, and it doesn't matter much one way or the other.
Here are the things I have done to try to disable caching:
Ran sudo hdparm -W 0 /dev/sdb, where /dev/sdb is the virtual disk I am writing to.
Set the buffering argument of open in the Python script to 0 (no Python write cache).
Is it basically an impossible task to make sure that my writes get put on the disk in sequential order? All I need is write #(n) to happen before write #(n+1), and #(n+1) before #(n+2), etc.
This is the Python script I'm using to write to disk (SIZE and PRIME change based on the size of the disk and a random seed):
from struct import pack, unpack
import sys

SIZE,PRIME = [x],[x]

# random I/O traversal iterator
def rand_index_generator(a,b):
    ctr=0
    while True:
        yield (ctr%b)
        ctr+=a

with open('/dev/sdb', 'rb+', buffering=0) as f:
    index_gen = rand_index_generator(PRIME, SIZE)
    # random traversal using iterator above, write counter to file
    for counter in xrange(1, SIZE-16):
        f.seek(index_gen.next()*4)
        f.write(pack('>I', counter))
Then, to validate, I traverse the disk in the same order and watch for gaps of unwritten data. This is after reverting the VM back to the snapshot. I know the traversal and writing logic works, since validation runs cleanly with no missed writes before reverting, but I think some "written" data dies in RAM and never makes it to disk.
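The validation pass described above could look roughly like the sketch below. It is not the original validation script: the name find_gaps and its parameters are made up for illustration, and it assumes the disk was zeroed beforehand and that the traversal never revisits a slot. A counter is only reported as missed if a later counter did land on disk; trailing empty slots are treated as writes that simply had not happened yet at snapshot time. It uses range and next() so it runs on both Python 2.7 and 3, and it is demonstrated on an in-memory buffer rather than /dev/sdb.

```python
import io
from struct import pack, unpack

def rand_index_generator(a, b):
    """Same traversal as the write script: 0, a, 2a, ... modulo b."""
    ctr = 0
    while True:
        yield ctr % b
        ctr += a

def find_gaps(f, prime, size, count):
    """Return counters that are absent even though a later counter is present."""
    index_gen = rand_index_generator(prime, size)
    missing = []   # confirmed gaps
    pending = []   # mismatches not yet followed by a good value
    for counter in range(1, count + 1):
        f.seek(next(index_gen) * 4)
        data = f.read(4)
        value = unpack('>I', data)[0] if len(data) == 4 else 0
        if value == counter:
            # A later write made it, so everything pending was really missed.
            missing.extend(pending)
            pending = []
        else:
            pending.append(counter)
    return missing

# Demo on an in-memory "disk": writes 4 and 5 are deliberately skipped.
disk = io.BytesIO(b'\x00' * 64)
gen = rand_index_generator(7, 16)
for counter in range(1, 8):
    slot = next(gen)
    if counter not in (4, 5):
        disk.seek(slot * 4)
        disk.write(pack('>I', counter))
print(find_gaps(disk, 7, 16, 10))  # [4, 5]
```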
I will take any suggestions that guarantee the write order I need for this application.
Found the answer to this question. I misunderstood the effect of writing to a raw disk: it did not eliminate OS caching, since I was still going through the OS to write to the raw disk. Oops.
To bypass OS caches, use os.open and pass the os.O_DIRECT and os.O_SYNC flags, so that writes happen in the correct sequence and are not stuck in volatile memory (the open(2) man page has more info on those flags). I used mmap with os file descriptors, but you could also use normal file handles.
The page size depends on your operating system and hardware. On a typical x86 Linux system it is 4096 bytes.
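Instead of hard-coding the number, the page size can be read at runtime from the standard mmap module. A small sketch, where the helper name page_aligned is only for illustration; it performs the same split into page start and in-page offset that the write loop needs:

```python
import mmap

# Page size reported for this machine (available in Python 2.7 and 3).
PAGESIZE = mmap.PAGESIZE

def page_aligned(write_loc, pagesize=PAGESIZE):
    """Split a byte offset into (start of its page, offset within that page)."""
    page_dist = write_loc % pagesize
    return write_loc - page_dist, page_dist

print(PAGESIZE)
print(page_aligned(3 * PAGESIZE + 8))
```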
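The top section of the code stayed the same (plus imports for os and mmap), but here is the write loop:
import mmap
import os

PAGESIZE = 4096
# O_DIRECT bypasses the OS page cache; O_SYNC asks for synchronous writes
filedesc = os.open('/dev/sdb', os.O_DIRECT|os.O_SYNC|os.O_RDWR)
for counter in xrange(1, SIZE-16):
    write_loc = index_gen.next()*4
    # mmap offsets must be page-aligned, so map the page containing write_loc
    page_dist = (write_loc%PAGESIZE)
    offset = write_loc - page_dist
    bytemap = mmap.mmap(filedesc, PAGESIZE, offset=offset)
    bytemap[page_dist:(page_dist+4)] = pack('>I', counter)
    # push the modified page out before moving on to the next counter
    bytemap.flush()
    bytemap.close()
os.close(filedesc)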
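For comparison, here is a rough sketch of the plain file-descriptor route mentioned above, without mmap. It is not the code that was actually used: the function name write_counters_sync and its parameters are made up for illustration, and it leaves out os.O_DIRECT, which needs page-aligned buffers that an ordinary Python string does not guarantee. It relies on os.O_SYNC plus an explicit os.fsync after every write, on the assumption that this is enough to stop writes from lingering in the OS cache; that assumption should be checked against the actual backup system. The demo runs against a scratch file rather than /dev/sdb.

```python
import os
import tempfile
from struct import pack, unpack

def rand_index_generator(a, b):
    """Same traversal as the write script: 0, a, 2a, ... modulo b."""
    ctr = 0
    while True:
        yield ctr % b
        ctr += a

def write_counters_sync(path, prime, size, count):
    """Write counters 1..count one at a time, syncing after each write."""
    fd = os.open(path, os.O_RDWR | os.O_SYNC)
    try:
        index_gen = rand_index_generator(prime, size)
        for counter in range(1, count + 1):
            os.lseek(fd, next(index_gen) * 4, os.SEEK_SET)
            os.write(fd, pack('>I', counter))
            os.fsync(fd)  # extra flush on top of O_SYNC; slow, but speed is not a goal here
    finally:
        os.close(fd)

# Try it on a zero-filled scratch file instead of a real block device.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'\x00' * 64)
scratch = tmp.name
write_counters_sync(scratch, 7, 16, 10)
with open(scratch, 'rb') as f:
    data = f.read()
os.remove(scratch)
print(unpack('>I', data[0:4])[0])  # counter stored in slot 0
```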