
Randomized-offset binary raw disk writes with no caching whatsoever


For my application, I am trying to determine whether a data backup system missed any writes. I do this by writing an incrementing integer counter to a 1 GB virtual disk; to check that no writes were missed, I look at the reverted snapshot and see whether there are any gaps (i.e. if I see 1, 2, 3, 0, 0, 6, 7, I know that the backup didn't capture writes 4 and 5 correctly). This is all on a CentOS 7 VM, with mostly Python 2.7 scripts for the writes and reads (speed isn't a huge concern).

A big part of my trouble has been caching: since I'm simulating random I/O, writes are often flushed from caches and written to disk out of order. This makes every test report a false positive, since some data looks missing at the time of the snapshot. Again, I don't care about efficiency at all, so I don't mind really slow writes. Reads can use caching; that's not a problem, and it doesn't matter much one way or the other.

Here are the things I have done to try to disable caching:

  1. disabled the disk write cache with sudo hdparm -W 0 /dev/sdb, where /dev/sdb is the disk I am writing to
  2. wrote to a raw disk with no filesystem, so no filesystem caching
  3. set the buffering argument of open in the Python script to 0 (no Python write buffering)

Is it basically an impossible task to make sure that my writes get put on the disk in sequential order? All I need is write #(n) to happen before write #(n+1), and #(n+1) before #(n+2), etc.

This is the Python script I'm using to write to disk (SIZE and PRIME change based on the size of the disk and a random seed):

from struct import pack, unpack

SIZE, PRIME = [x], [x]  # placeholders: set from the disk size and a random seed

# random I/O traversal iterator: yields 0, a, 2a, ... modulo b
def rand_index_generator(a, b):
    ctr = 0
    while True:
        yield (ctr % b)
        ctr += a

with open('/dev/sdb', 'rb+', buffering=0) as f:
    index_gen = rand_index_generator(PRIME, SIZE)
    # random traversal using iterator above, write counter to file
    for counter in xrange(1, SIZE-16):
        f.seek(index_gen.next()*4)           # each counter occupies 4 bytes
        f.write(pack('>I', counter))         # big-endian unsigned 32-bit int
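
The traversal above only reaches every slot if the stride shares no common factor with the size. A minimal sketch of that property, written in Python 3 syntax and using small made-up values (7, 6 and 16 are illustrative, not the real PRIME and SIZE):

```python
from math import gcd

def rand_index_generator(a, b):
    """Yield 0, a, 2a, ... modulo b, the same stride walk as the script above."""
    ctr = 0
    while True:
        yield ctr % b
        ctr += a

def visits_every_index(stride, size):
    """Return True if the first `size` indices cover 0..size-1 exactly once."""
    gen = rand_index_generator(stride, size)
    seen = {next(gen) for _ in range(size)}
    return len(seen) == size

# Hypothetical small values, chosen only for illustration.
print(visits_every_index(7, 16), gcd(7, 16))  # prints: True 1
print(visits_every_index(6, 16), gcd(6, 16))  # prints: False 2
```

Picking a prime stride that does not divide the size is one simple way to get a common factor of 1.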

Then, to validate, I traverse in the same order and watch for gaps of unwritten data. This happens after reverting the VM to the snapshot. I know the traversal and the writes themselves work, since validation runs cleanly with no missed writes before reverting, but I think some "written" data dies in RAM and never makes it to disk.
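
For illustration, the validation pass described here could look roughly like the sketch below. It is not the original validation script: the helper names stride_indices and find_missing, the count parameter (the highest counter expected on disk), and the in-memory image are all assumptions made for the example, and it uses Python 3 syntax.

```python
import io
from struct import pack, unpack

def stride_indices(stride, size, count):
    """Return the first `count` slot indices of the stride walk."""
    return [(i * stride) % size for i in range(count)]

def find_missing(f, stride, size, count):
    """Return the counters in 1..count whose slot does not hold the expected value."""
    missing = []
    for counter, index in enumerate(stride_indices(stride, size, count), start=1):
        f.seek(index * 4)
        (value,) = unpack('>I', f.read(4))
        if value != counter:
            missing.append(counter)
    return missing

# Demo on an in-memory image (hypothetical sizes); writes 4 and 5 are dropped.
image = io.BytesIO(b'\x00' * 16 * 4)
for counter, index in enumerate(stride_indices(7, 16, 7), start=1):
    if counter not in (4, 5):
        image.seek(index * 4)
        image.write(pack('>I', counter))

print(find_missing(image, 7, 16, 7))  # prints: [4, 5]
```

Against the real disk, f would be the device opened for reading instead of the BytesIO object.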

Will take any suggestions to guarantee the write order I need for this application


Solution

  • Found the answer to this question. I had misunderstood the effect of writing to a raw disk: it did not eliminate OS caching, since I was still calling the OS to write to my raw disk. Oops.

    To bypass OS caches you should use os.open and pass the os.O_DIRECT and os.O_SYNC flags, so that writes happen in the correct sequence and are not stuck in volatile memory. I used mmap with os file descriptors, but you could also use normal file handles.

    Page size depends on your platform. On Linux it is typically 4096 bytes (x86 and x86-64); Python exposes the actual value as mmap.PAGESIZE.
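
As a small illustration of the page arithmetic involved, here is a sketch that splits a byte offset into a page-aligned start and a distance into that page. The helper name split_offset is made up for the example; mmap.PAGESIZE is the standard library constant.

```python
import mmap

def split_offset(write_loc, page_size=mmap.PAGESIZE):
    """Split a byte offset into (page-aligned start, distance into that page)."""
    page_dist = write_loc % page_size
    return write_loc - page_dist, page_dist

print(mmap.PAGESIZE)                       # page size reported by this platform
print(split_offset(4100, page_size=4096))  # prints: (4096, 4)
```

The aligned start is what mmap needs as its offset argument, and the distance is where the 4-byte counter goes inside the mapped page.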

    The top section of the code stayed the same but here is the write loop:

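    import mmap
    import os

    PAGESIZE = 4096
    # O_DIRECT asks the OS to bypass its page cache; O_SYNC makes each write
    # return only after the data has been handed to the device
    filedesc = os.open('/dev/sdb', os.O_DIRECT|os.O_SYNC|os.O_RDWR)
    for counter in xrange(1, SIZE-16):
        write_loc = index_gen.next()*4
        page_dist = (write_loc%PAGESIZE)     # byte position inside the page
        offset = write_loc - page_dist       # page-aligned start for mmap
        bytemap = mmap.mmap(filedesc, PAGESIZE, offset=offset)
        bytemap[page_dist:(page_dist+4)] = pack('>I', counter)
        bytemap.flush()
        bytemap.close()
    os.close(filedesc)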
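
For comparison, here is a rough sketch of the same read-modify-write step done with plain descriptor calls instead of mapping the device. It is an assumption-laden illustration, not the code I ran: it uses Python 3.7+ on Linux (os.preadv, os.pwrite), the name write_counter and the 4096-byte BLOCK are made up, and the demo runs on a scratch file opened without O_DIRECT because not every filesystem accepts that flag. On the real disk the descriptor would come from os.open('/dev/sdb', os.O_RDWR | os.O_DIRECT | os.O_SYNC). An anonymous mmap serves as the buffer only because it is page-aligned, which O_DIRECT transfers generally require.

```python
import mmap
import os
import tempfile
from struct import pack, unpack

BLOCK = 4096  # assumed block size; O_DIRECT needs aligned offsets and lengths

def write_counter(fd, write_loc, counter, block=BLOCK):
    """Read-modify-write the aligned block that contains write_loc."""
    start = write_loc - (write_loc % block)
    buf = mmap.mmap(-1, block)        # anonymous map used as a page-aligned buffer
    try:
        os.preadv(fd, [buf], start)   # load the current contents of the block
        pos = write_loc - start
        buf[pos:pos + 4] = pack('>I', counter)
        os.pwrite(fd, buf, start)     # write the whole block back
    finally:
        buf.close()

# Demo on a scratch file (no O_DIRECT here, see the note above).
fd, path = tempfile.mkstemp()
try:
    os.ftruncate(fd, 2 * BLOCK)
    write_counter(fd, BLOCK + 8, 42)
    print(unpack('>I', os.pread(fd, 4, BLOCK + 8))[0])  # prints: 42
finally:
    os.close(fd)
    os.unlink(path)
```

Writing whole aligned blocks costs more I/O per counter, which is acceptable here since speed is not a concern.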