
Imread & Imwrite do not achieve expected gains on a Ramdisk


I have written a particular image processing algorithm that makes heavy use of imwrite and imread. The following example will run simultaneously on eight Matlab sessions on a hyper-threading-enabled 6-core i7 machine. (Filenames are different for each session.)

tic; 
for i=1:1000 
    %a processing operation will be put here%
    imwrite(imgarray,temp,'Quality',100); 
    imgarray=imread(temp); 
end 
toc;

I'm considering changing the example code to use temp=[ramdrive_loc temp]; for two purposes:

  • Reducing time consumption
  • Reducing hard drive wear

The image files created are about 1 MB in size. The hard drives are two 7,200 RPM Caviar Blacks configured as RAID 0. The machine runs Windows, and the partitions are formatted as NTFS.

The outputs of toc from the code above (without any image processing) are:

Without Ramdisk: 104.330466 seconds.

With Ramdisk: 106.100880 seconds.

Is there anything that is preventing me from gaining any speed? Would changing the file system of the ramdisk to FAT32 help?

Note: There were other questions regarding ramdisk vs. harddisk comparisons; however this question is mostly about imread, imwrite, and Matlab I/O.

Addition: The RAM disk is set up with free software from SoftPerfect. It has 3 GB of space, which is more than adequate for the task (a maximum of 10 MB is generated and written over and over during the Matlab sessions).


Solution

  • File caching. Windows' file cache is probably already speeding up your I/O activity here, so the RAM disk isn't giving you an additional speedup. When you write out the file, it's written to the file cache and then asynchronously flushed to the disk, so your Matlab code doesn't have to wait for the physical disk writes to complete. And when you immediately read the same file back into memory, there's a high chance it's still present in the file cache, so it's served from memory instead of incurring a physical disk read.

    If that's your actual code, you're re-writing the same file over and over again, which means all the activity may be happening inside the disk cache, so you're not hitting a bottleneck with the underlying storage mechanism.

    Rewrite your test code so it looks more like your actual workload: writing to different files on each pass if that's what you'll be doing in practice, including the image processing code, and actually running multiple processes in parallel. Put it in the Matlab profiler, or add finer-grained tic/toc calls, to see how much time you're actually spending in I/O (e.g. imread and imwrite, and the parts of them that are doing file I/O). If you're doing nontrivial processing outside the I/O, you might not see significant, if any, speedup from the RAM disk because the file cache would have time to do the actual physical I/O during your other processing.

    And since you say a maximum of 10 MB gets written over and over again, that's small enough to fit easily inside the file cache in the first place, and your actual physical I/O throughput is pretty small: if you write a file and then overwrite its contents with new data before the file cache flushes it to disk, the OS never has to flush that first set of data all the way to disk. Your I/O might already be happening mostly in memory due to the cache, so switching to a RAM disk won't help because physical I/O isn't the bottleneck.

    Modern operating systems do a lot of caching because they know scenarios like this happen. A RAM disk isn't necessarily going to be a big speedup. There's nothing specific to Matlab or imread/imwrite about this behavior; the other RAM disk questions, like "RAMdisk slower than disk?", are still relevant.
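
As a rough illustration of the finer-grained tic/toc timing suggested above, the following is a minimal sketch, not the asker's actual workload. It writes a different file on each pass, times imwrite and imread separately, and compares them with a raw fwrite/fread of the same number of bytes to approximate the pure file I/O share. The names outDir, nIter, the ramtest_ file prefix, and the random 512x512 test image are all assumptions made for the example; point outDir at the RAM disk or the hard disk to compare the two.

```matlab
% Sketch: time imwrite/imread separately and compare with raw byte I/O.
% All names (outDir, nIter, tWrite, ...) are illustrative assumptions.
outDir   = tempdir;                              % swap in the RAM disk path to compare
nIter    = 20;                                   % small count for a quick check
imgarray = uint8(randi([0 255], 512, 512, 3));   % stand-in test image

tWrite    = zeros(nIter, 1);  tRead    = zeros(nIter, 1);
tRawWrite = zeros(nIter, 1);  tRawRead = zeros(nIter, 1);

for i = 1:nIter
    % Use a different file on every pass instead of overwriting one file
    fname = fullfile(outDir, sprintf('ramtest_%04d.jpg', i));

    t = tic; imwrite(imgarray, fname, 'Quality', 100); tWrite(i) = toc(t);
    t = tic; img = imread(fname);                      tRead(i)  = toc(t);

    % Raw I/O of the same number of bytes: file system cost without JPEG work
    info    = dir(fname);
    bytes   = zeros(info.bytes, 1, 'uint8');
    rawName = fullfile(outDir, sprintf('ramtest_%04d.bin', i));

    t = tic;
    fid = fopen(rawName, 'w'); fwrite(fid, bytes, 'uint8'); fclose(fid);
    tRawWrite(i) = toc(t);

    t = tic;
    fid = fopen(rawName, 'r'); back = fread(fid, Inf, 'uint8=>uint8'); fclose(fid);
    tRawRead(i) = toc(t);

    delete(fname); delete(rawName);              % clean up the temporary files
end

fprintf('imwrite: %.2f ms   raw write: %.2f ms\n', 1e3*mean(tWrite), 1e3*mean(tRawWrite));
fprintf('imread : %.2f ms   raw read : %.2f ms\n', 1e3*mean(tRead),  1e3*mean(tRawRead));
```

If the imwrite and imread times turn out to be much larger than the raw write and read times, most of the time is probably going into encoding and decoding the image rather than into file I/O, and in that case a RAM disk would not be expected to change much. The raw timings still pass through the file cache, so they show the I/O cost as Matlab sees it, not the physical disk cost.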