c file stream fseek

What is the advantage of using fseek over a sequence of fread calls in C?


I'm a beginner in C programming and I have some questions regarding how to deal with files.

Let us suppose that we have a binary file with N int values stored, and that we want to read the i-th int value in the file.

Is there any real advantage of using fseek for positioning the file pointer to the i-th int value and reading it after the fseek instead of using a sequence of i fread calls?

Intuitively, I think that fseek is faster. But how does the function find the i-th value in the file without reading the intermediate data?

I think that this is implementation-dependent. So, I tried to find the implementation of fseek function, without much success.


Solution

  • But how does the function find the i-th value in the file without reading the intermediate data?

    It doesn't. It's up to you to provide the correct (absolute or relative) offset. You can request, for example, to advance the file pointer by i*sizeof(X).

    The file system still needs to follow the chain of sectors in which the file is located to find the right one, but that doesn't require reading those sectors. That metadata is stored outside of the file itself.

    Is there any real advantage of using fseek for positioning the file pointer to the i-th int value and reading it after the fseek instead of using a sequence of i fread calls?

    There are potential benefits at every level.

    By seeking, the system may have to read less from the disk. The system reads from the disk in sectors, so short seeks might not have this benefit because of caching. But seeking over entire sectors reduces the amount of data that needs to be fetched from the disk.

    Similarly, by seeking, the stdio library may have to request less from the OS. The stdio library normally reads more than it requires so that future calls to fread don't need to touch the OS or the disk. A short seek might not require making any system calls, but seeking beyond the end of the buffered data could reduce the total amount of data fetched from the OS.

    Finally, the skipped data doesn't need to be copied from the stdio library's buffers to the user's buffer at all when using fseek, no matter how far you seek.

    Oh, and let's not forget that you were considering i-1 small reads instead of just one large one. Each of those reads consumes CPU, both in the library (error checking) and in the caller (error handling).
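
To make the two approaches concrete, here is a minimal sketch of both. The helper names read_int_with_fseek and read_int_with_freads are made up for illustration, and the sketch assumes that i is a zero-based index, that the file was opened in binary mode, and that the values were written as raw int objects on the same platform. Both helpers return 0 on success and -1 on failure.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: jump straight to the int at zero-based index i.
 * Nothing between the start of the file and the target is copied to us. */
int read_int_with_fseek(FILE *f, long i, int *out)
{
    assert(f != NULL && out != NULL);
    if (i < 0)
        return -1;

    /* fseek takes a long offset, so this sketch assumes i * sizeof(int)
     * fits in a long; very large files need a platform-specific call. */
    if (fseek(f, i * (long)sizeof(int), SEEK_SET) != 0)
        return -1;

    /* Seeking past the end can succeed; the short read is what reports it. */
    return fread(out, sizeof *out, 1, f) == 1 ? 0 : -1;
}

/* Hypothetical helper: reach the same int by reading and discarding
 * everything before it. Every skipped value is copied into `skipped`
 * and costs one library call plus one error check. */
int read_int_with_freads(FILE *f, long i, int *out)
{
    assert(f != NULL && out != NULL);
    int skipped;

    if (i < 0)
        return -1;

    rewind(f); /* start counting from the beginning of the file */
    for (long k = 0; k < i; k++) {
        if (fread(&skipped, sizeof skipped, 1, f) != 1)
            return -1;
    }
    return fread(out, sizeof *out, 1, f) == 1 ? 0 : -1;
}
```

Both functions should return the same value for the same index; the difference is only in how much work is done to get there. With a stream opened as fopen(path, "rb"), a call such as read_int_with_fseek(f, 42, &value) would fetch the 43rd stored int.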