I have the following question regarding C file I/O.
At the physical level (hard drive), is it valid to assume that every fread(n_blocks, size, length, FILE *fp)
operation costs one random access for the first page (block) and n-1 sequential accesses for the remaining blocks of that same buffer?
I assume this because the OS runs so many processes that one of them is almost certainly reading from or writing to a file between consecutive fread calls
of my program, so by then the drive head will be positioned at another sector / cylinder.
Is this assumption OK?
Whether you assume that or not, this is a gross oversimplification of the reality.
First things first: you seem to think that the 3rd parameter (length) corresponds to the number of discrete 'access operations'. This is not the case. All fread does is read size*length bytes; thus the following three calls read exactly the same data, as long as the multiplication doesn't overflow (only the return value, which counts complete items, differs):
fread(n_blocks, size, length, fp);
fread(n_blocks, size*length, 1, fp);
fread(n_blocks, 1, size*length, fp);
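To see that equivalence in practice, here is a minimal sketch. The helper names read_back and same_bytes_three_ways, the temporary file, and the 16 x 64 sizes are all made up for illustration; none of them come from the question.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fills a temporary file with size*count bytes,
   then reads them back with fread(buf, size, count, fp).
   Returns whatever fread returned (the number of complete items). */
static size_t read_back(void *buf, size_t size, size_t count)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return 0;

    for (size_t i = 0; i < size * count; i++)
        fputc((int)(i % 251), fp);
    rewind(fp);

    size_t items = fread(buf, size, count, fp);
    fclose(fp);
    return items;
}

/* Returns 1 if the three call shapes deliver identical bytes, else 0. */
int same_bytes_three_ways(void)
{
    enum { SIZE = 16, LENGTH = 64 };   /* assumed sizes, 1024 bytes total */
    unsigned char a[SIZE * LENGTH], b[SIZE * LENGTH], c[SIZE * LENGTH];

    size_t ra = read_back(a, SIZE, LENGTH);        /* 64 items of 16 bytes  */
    size_t rb = read_back(b, SIZE * LENGTH, 1);    /* 1 item of 1024 bytes  */
    size_t rc = read_back(c, 1, SIZE * LENGTH);    /* 1024 items of 1 byte  */

    /* Only the item counts differ between the three calls. */
    if (ra != (size_t)LENGTH || rb != 1 || rc != (size_t)(SIZE * LENGTH))
        return 0;

    return memcmp(a, b, sizeof a) == 0 && memcmp(a, c, sizeof a) == 0;
}
```

Calling same_bytes_three_ways() should return 1 wherever tmpfile() is able to create a file. How the item size is split has no bearing on how the data is fetched from the drive.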
What actually happens is that fread/fwrite read from and write to an internal buffer in the memory of your process. That buffer can be controlled with the setbuf/setvbuf functions. When the buffer is empty (on read) or full (on write), they forward the request to the operating system, which has its own file cache. If you are reading and the OS can't find that portion of the file in its cache, your program will wait until the data is actually fetched from the drive. When writing, the data is copied to the OS cache and stays there until the OS decides to write it to the drive, which may happen long after your program has closed the file and exited. In turn, today's hard drives have their own internal caches, which the OS may not even be aware of.
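As a rough sketch of the first layer, the following resizes the stdio buffer with setvbuf. The helper name read_through_buffer and its parameters are hypothetical; the 256-byte scratch array is an arbitrary choice.

```c
#include <stdio.h>

/* Hypothetical helper: writes `total` bytes to a temporary file, then reads
   them back in `chunk`-byte fread calls through a stdio buffer of `bufsize`
   bytes. Returns the number of bytes read, or -1 on error. */
long read_through_buffer(size_t bufsize, size_t chunk, long total)
{
    char tmp[256];
    if (chunk == 0 || chunk > sizeof tmp)
        return -1;

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    /* setvbuf has to be called before any other operation on the stream.
       Passing NULL lets the library allocate the buffer itself. */
    if (setvbuf(fp, NULL, _IOFBF, bufsize) != 0) {
        fclose(fp);
        return -1;
    }

    for (long i = 0; i < total; i++)
        fputc('x', fp);
    rewind(fp);

    /* Many small fread calls; how often the library actually goes to the
       OS depends on the stdio buffer, not on the number of calls. */
    long seen = 0;
    size_t got;
    while ((got = fread(tmp, 1, chunk, fp)) > 0)
        seen += (long)got;

    fclose(fp);
    return seen;
}
```

For example, read_through_buffer(65536, 7, 100000) issues thousands of 7-byte fread calls, yet with a 64 KiB buffer the library only needs to refill from the OS a handful of times. What happens below that (OS cache, drive cache) is not visible from C at all.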
For all practical purposes, you should not concern yourself with how many drive accesses each fread/fwrite call does. Just know that C, the OS, and the hardware underneath will do their best to provide the requested data as fast as possible. However, keep in mind that this entire stack is optimized for sequential access. So avoid jumping all around the file with fseek for no good reason.