Tags: linux, caching, linux-kernel, filesystems, page-caching

Are file reads served from dirtied pages in the page cache?


When bytes are written to a file, the kernel does not immediately write those bytes to disk but stores the bytes in dirtied pages in the page cache (write-back caching).

The question is: if a file read is issued before the dirty pages are flushed to disk, will the bytes be served from the dirtied pages in the cache, or will the dirty pages first be flushed to disk, followed by a disk read to serve the bytes (storing them in the page cache in the process)?


Solution

  • The file read will fetch the data from the page cache without first writing it to disk. From Linux Kernel Development, 3rd Edition, by Robert Love:

    Whenever the kernel begins a read operation—for example, when a process issues the read() system call—it first checks if the requisite data is in the page cache. If it is, the kernel can forgo accessing the disk and read the data directly out of RAM. This is called a cache hit. If the data is not in the cache, called a cache miss, the kernel must schedule block I/O operations to read the data off the disk.

    Writeback to disk happens periodically and independently of reads:

    The third strategy, employed by Linux, is called write-back. In a write-back cache, processes perform write operations directly into the page cache. The backing store is not immediately or directly updated. Instead, the written-to pages in the page cache are marked as dirty and are added to a dirty list. Periodically, pages in the dirty list are written back to disk in a process called writeback, bringing the on-disk copy in line with the in-memory cache.
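
The behaviour described above can be observed from user space with a small experiment. The following is a minimal sketch, assuming a Linux system and an ordinary temporary file. The function name `write_then_read` is hypothetical and chosen only for illustration. It writes bytes with `write(2)`, deliberately skips `fsync`, and reads the bytes back through a second file descriptor.

```python
import os
import tempfile


def write_then_read(payload: bytes) -> bytes:
    """Write payload without fsync, then read it back via a separate descriptor.

    Hypothetical helper for illustration only.
    """
    fd_w, path = tempfile.mkstemp()
    try:
        # os.write issues write(2) directly, so Python's own userspace
        # buffering is not involved; the data goes to the kernel.
        written = os.write(fd_w, payload)
        if written != len(payload):
            raise OSError("short write")

        # No os.fsync(fd_w) here: the written pages may still be dirty
        # in the page cache and not yet on disk.
        fd_r = os.open(path, os.O_RDONLY)
        try:
            # A small read on a regular file is assumed to return all bytes.
            return os.read(fd_r, len(payload))
        finally:
            os.close(fd_r)
    finally:
        os.close(fd_w)
        os.unlink(path)


if __name__ == "__main__":
    data = b"hello page cache"
    print(write_then_read(data) == data)
```

The read returns the freshly written bytes even though nothing forced them to disk. This sketch only shows that the read sees the new data; it does not by itself prove that no writeback happened in between, since the kernel may flush dirty pages at any time. Calling `os.fsync(fd_w)` before the read would force the flush, but it is not needed for the read to return the written bytes.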