java, caching, benchmarking, caliper, disk-io

How to measure file read speed without caching?


My Java program spends most of its time reading files, and I want to optimize it, e.g., by using concurrency, prefetching, memory-mapped files, or whatever.

Optimizing without benchmarking is nonsense, so I benchmark. However, during the benchmark the whole file content gets cached in RAM, unlike in the real run. Thus the benchmark run times are much shorter and most probably unrelated to reality.

I need to tell the OS (Linux) not to cache the file content or, better, to wipe the cache before each benchmark run. Alternatively, I could consume most of the available RAM (32 GB) so that only a tiny fraction of the file content fits in the cache. How can I do this?

I'm using Caliper for benchmarking, but in this case I don't think it's necessary (it's by no means a microbenchmark), and I'm not sure it's a good idea.


Solution

  • Clear the Linux file cache

    # Run as root. Writing 1 drops the page cache only; 3 also drops dentries and inodes.
    sync && echo 1 > /proc/sys/vm/drop_caches
    

  • Or create a large file that uses all your RAM, so that the benchmark files are pushed out of the cache

    dd if=/dev/zero of=dummyfile bs=1024 count=LARGE_NUMBER
    

    (Don't forget to remove dummyfile when done.)
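
Since the program being benchmarked is written in Java, the first option can also be driven from the benchmark itself. The following is only a minimal sketch under a few assumptions: the class name ColdReadBenchmark and the helpers readAll and dropCaches are made up for illustration, the JVM is assumed to run as root on Linux (otherwise the write to /proc/sys/vm/drop_caches is refused), and the run count of 5 is arbitrary. It drops the page cache before every timed read so that each run starts cold.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ColdReadBenchmark {

    /** Reads the whole file in 64 KiB chunks and returns the number of bytes read. */
    public static long readAll(Path file) throws IOException {
        byte[] buffer = new byte[1 << 16];
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    /**
     * Flushes dirty pages and asks the kernel to drop the page cache.
     * Returns true if the shell command succeeded; expect false unless
     * the JVM runs as root (assumption of this sketch).
     */
    public static boolean dropCaches() throws IOException, InterruptedException {
        Process p = new ProcessBuilder("sh", "-c",
                "sync && echo 1 > /proc/sys/vm/drop_caches")
                .inheritIO()
                .start();
        return p.waitFor() == 0;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java ColdReadBenchmark <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        int runs = 5; // arbitrary number of repetitions
        for (int i = 0; i < runs; i++) {
            if (!dropCaches()) {
                System.err.println("Could not drop caches (not root?); this run may be served from RAM");
            }
            long start = System.nanoTime();
            long bytes = readAll(file);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("run %d: %d bytes in %d ms%n", i, bytes, elapsedMs);
        }
    }
}
```

If the timings with and without the dropCaches call look the same, the cache was probably not dropped, and the warning printed on stderr should say why.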