Search code examples
operating-systemvirtual-memorypage-tables

How does the OS update the appropriate page table when it evicts a victim page?


In an OS that uses virtual memory, each process has a page table. Each page table maps a process's virtual memory pages to the system's physical memory pages and indicates whether a given page is currently valid (loaded in memory) or not.

Suppose that memory is running low and the OS needs to choose a page to evict from physical memory. There are different algorithms for this. For example, FIFO, LRU. Once the OS chooses a page to evict, how does it invalidate any existing references to the page?

If the victim page is currently used by the active process, then the OS must invalidate the mapping in the current process's page table. If the victim page is currently used by another process, the OS must invalidate the mapping in the other process's page table. In any case, how does the OS determine which page table to update (if any), and how does it know where the mapping lives in that page table without doing a linear search?

There is a detailed description of x86 page table structure beginning at slide 22 of this presentation:

http://www.scs.stanford.edu/12au-cs140/notes/l8.pdf

I also found some helpful overviews of virtual memory:

http://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/9_VirtualMemory.html http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Memory/virtual.html

Similar Stack Overflow question without an answer:

how does linux update page table after context switch


Solution

  • Actually, the thing you are asking about called reverse mapping. For example, in Linux, you can find usefull function try_to_unmap_anon Inside page descriptor there is a field called mapping. This field is anon_vma for annonymous pages. As you can see this is not just ordinary struct, but also a list entry. There might be several anon_vmas for one page (see try_to_unmap_anon):

    list_for_each_entry(vma, &anon_vma->head, anon_vma_node)
    

    exactly one per page mapping. All this vmas linked into list. That is how kernel knows which processes (and their page tables) are in play. Now about how kernel determines the virtual address ... again the answer could be found here: vma_address

    233         pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
    234         unsigned long address;
    235 
    236         address = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
    

    So we can answer your question shortly now: in order not to do the page tables scan, kernel stores everything it needs (for quick fetch) in page descriptor (struct page).