I want to know how long it takes my program to read a 12.9MB .wav
file into memory. The function that reads a file into memory looks as follows:
import qualified Data.ByteString as BS
getSamplesFromFileAsBS :: FilePath -> IO (BS.ByteString)
It takes the name of the file and returns the samples as a ByteString. It also performs some other validity checks on the data and ignores the header information. I read the ByteString of samples into memory using ByteString.hGet.
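For reference, the reading part boils down to something like the sketch below (the validity checks are omitted, and the 44-byte header size assumes the canonical RIFF/WAVE layout):

```haskell
import qualified Data.ByteString as BS
import System.IO (IOMode (ReadMode), SeekMode (AbsoluteSeek),
                  hFileSize, hSeek, withBinaryFile)

-- Canonical RIFF/WAVE header length (assumption; real files can differ).
wavHeaderSize :: Integer
wavHeaderSize = 44

-- Skip the header and read all remaining sample bytes in one hGet call.
getSamplesFromFileAsBS :: FilePath -> IO BS.ByteString
getSamplesFromFileAsBS fp = withBinaryFile fp ReadMode $ \h -> do
  size <- hFileSize h
  hSeek h AbsoluteSeek wavHeaderSize
  BS.hGet h (fromIntegral (size - wavHeaderSize))
```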
If I now benchmark this function with a 12.9MB file, using Criterion:
bencher :: FilePath -> IO ()
bencher fp = defaultMain
  [ bench "Reading all the samples from a file." $
      nfIO (getSamplesFromFileAsBS fp)
  ]
I get the following result:
benchmarking Reading all the samples from a file.
time 3.617 ms (3.520 ms .. 3.730 ms)
0.989 R² (0.981 R² .. 0.994 R²)
mean 3.760 ms (3.662 ms .. 3.875 ms)
std dev 354.0 μs (259.9 μs .. 552.5 μs)
variance introduced by outliers: 62% (severely inflated)
It seems to load 12.9MB into memory in 3.617ms. This doesn't seem realistic, since it implies my SSD can read at roughly 3.5GB/s, which is not the case at all. What am I doing wrong?
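Sanity-checking the arithmetic behind that claim (using the 12.9 MB size and the 3.617 ms mean from the Criterion output):

```haskell
-- Implied throughput of the Criterion result: 12.9 MB read in 3.617 ms,
-- converted from MB/s to GB/s.
impliedGBps :: Double
impliedGBps = 12.9 / 3.617e-3 / 1000  -- roughly 3.57 GB/s
```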
I decided to try this another (more naive) way, by manually measuring the time difference:
import Control.DeepSeq (deepseq)
import Data.Time.Clock (diffUTCTime, getCurrentTime)

runBenchmarks :: FilePath -> IO ()
runBenchmarks fp = do
  start <- getCurrentTime
  samplesBS <- getSamplesFromFileAsBS fp
  end <- samplesBS `deepseq` getCurrentTime
  print (diffUTCTime end start)
This gives me the following result: 0.023105s. This is realistic, since it corresponds to a read speed of around 560MB/s for my SSD. What is wrong with the Criterion result?
I looked at the visual results of my Criterion benchmark by writing the output to an HTML file. I could clearly see that the first run took around 0.020s, whereas the rest (after caching) took around 0.003s.
So I'm getting these results because of caching: Criterion runs the benchmark many times, and after the first run the file is served from the OS page cache, so the reported mean reflects cached reads rather than actual disk throughput.
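The effect can be seen directly without Criterion (a sketch, not part of the original benchmark): time the same read twice in one process. The second call is typically served from the OS page cache and finishes far faster than the first.

```haskell
import qualified Data.ByteString as BS
import Control.DeepSeq (deepseq)
import Data.Time.Clock (NominalDiffTime, diffUTCTime, getCurrentTime)

-- Read a file strictly, reporting how many bytes arrived and how long it took.
timedRead :: FilePath -> IO (Int, NominalDiffTime)
timedRead fp = do
  start <- getCurrentTime
  bs <- BS.readFile fp
  end <- bs `deepseq` getCurrentTime  -- force the bytes before stopping the clock
  pure (BS.length bs, diffUTCTime end start)
```

Calling timedRead twice in a row on the 12.9 MB file reproduces the pattern from the HTML report: a slow first (cold) read, then fast cached ones.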