windows linux operating-system virtual-memory pagefile

How can I get read-ahead bytes?

Operating systems read more from disk than a program actually requests, because the program is likely to need nearby data soon. In my application, when I fetch an item from disk, I would like to show an interval of information around that item. There's a trade-off between how much information I request and show, and speed. However, since the OS has already read more than I requested, accessing the extra bytes that are already in memory is essentially free. What API can I use to find out what's in the OS caches?

Alternatively, I could use memory-mapped files. In that case, the problem reduces to finding out whether a page is resident in memory or still has to be read from disk. Can this be done in any common OS?

Solution

You can indeed use your second method, at least on Linux. mmap() the file, then use the mincore() function to determine which pages are resident. From the man page:

#include <sys/mman.h>

int mincore(void *addr, size_t length, unsigned char *vec);

mincore() returns a vector that indicates whether pages of the calling process's virtual memory are resident in core (RAM), and so will not cause a disk access (page fault) if referenced. The kernel returns residency information about the pages starting at the address addr, and continuing for length bytes.
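As a rough sketch of how the two calls fit together (Linux with glibc assumed), something like the following maps a file read-only and counts how many of its pages are currently resident. The helper name count_resident_pages is made up for illustration, and error handling is kept to a minimum.

```c
#define _DEFAULT_SOURCE   /* exposes mincore() on glibc under strict -std modes */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns how many pages of the file at `path` are
 * resident in RAM right now, or -1 on error (including an empty file).
 * On success, *total_pages receives the number of pages the file spans. */
long count_resident_pages(const char *path, size_t *total_pages)
{
    long resident = -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        size_t len = (size_t)st.st_size;
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        size_t npages = (len + page - 1) / page;   /* one vec byte per page */

        /* Mapping the file does not read it; pages are faulted in on access. */
        void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        if (map != MAP_FAILED) {
            unsigned char *vec = malloc(npages);
            if (vec != NULL && mincore(map, len, vec) == 0) {
                resident = 0;
                for (size_t i = 0; i < npages; i++)
                    resident += vec[i] & 1;        /* low bit set = resident */
                *total_pages = npages;
            }
            free(vec);
            munmap(map, len);
        }
    }
    close(fd);
    return resident;
}
```

In a real application you would keep the mapping and the vector around rather than just counting, so that you can look up the pages next to the item you are about to display.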

There is, of course, a race condition here: mincore() can tell you that a page is resident, but the page might then be evicted just before you access it. C'est la vie.
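For the use case in the question (showing an interval around an item), one possible approach is to start at the page holding the item and widen the interval across neighbouring pages that mincore() reported as resident. The function below is only a sketch: resident_run is a hypothetical name, vec is assumed to be the vector filled in by mincore(), and because of the race the result is a hint rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical helper: starting from page `idx`, find the longest run of
 * contiguous resident pages that contains it, using the vector filled in
 * by mincore(). Returns 1 and sets *first / *last (inclusive page indices)
 * if page `idx` is resident; returns 0 if it is not, or if idx is out of
 * range. */
int resident_run(const unsigned char *vec, size_t npages, size_t idx,
                 size_t *first, size_t *last)
{
    if (idx >= npages || !(vec[idx] & 1))
        return 0;

    size_t lo = idx, hi = idx;
    while (lo > 0 && (vec[lo - 1] & 1))          /* extend to the left */
        lo--;
    while (hi + 1 < npages && (vec[hi + 1] & 1)) /* extend to the right */
        hi++;

    *first = lo;
    *last = hi;
    return 1;
}
```

Multiplying the returned page indices by the page size gives the byte range inside the mapping that can be read without (most likely) touching the disk.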