
How to find the right swap page from a virtual address on a page fault


Suppose there are 3 processes, each using virtual address 0x400000 for its text section, and there is only one 4 KB physical page available to user processes.

Process 0 is currently using the physical page (mapped at virtual address 0x400000). Call the data in that page page_pid_0_0x400000.

When the OS schedules process 1, page_pid_1_0x400000 is loaded from the executable into the physical page, so page_pid_0_0x400000 has to be swapped out to disk first.

When process 2 is scheduled in turn, page_pid_2_0x400000 is loaded into the physical page, so page_pid_1_0x400000 has to be swapped out to disk as well.

Now the disk holds two swapped-out pages for the same virtual address 0x400000: page_pid_1_0x400000 and page_pid_0_0x400000.

If process 1 is scheduled again, how can I (the OS) find page_pid_1_0x400000 from virtual address 0x400000, given that memory access instructions only carry the virtual address 0x400000 and not the process ID?


Solution

  • The operating system can keep all sorts of associated data structures. For example, each process can have its own data structures (and page tables) representing its address space, and the operating system just has to make sure to point the CPU at the correct set of page tables when it resumes the process.

    Similarly, the swap handling isn't constrained to use just a virtual address; it can use the pair (address space, virtual address) to look up the swap location. It can make this as flexible or as rigid as need be. For example, it might treat a virtual address as part of a contiguous range of virtual addresses whose pages are stored in a related way in files or swap.

    The page tables, and the notion of a virtual address, are an interface to the address translation done by the CPU+MMU. Beyond that, the operating system can maintain whichever associated data structures it wants.

    In older systems, each page descriptor (sometimes called a page table entry, or PTE) had a bit that determined whether the page was valid. The CPU/MMU ignored the remaining bits of an entry that was not marked valid, so when a page was swapped to disk, those bits were a handy place to store its swap address on disk.

    Modern systems tend to have more complex data structures to accommodate transient sharing and locking of pages, so often an auxiliary structure is used.
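
The scenario from the question can be sketched as a small simulation. This is only a toy model under stated assumptions, not how any real kernel is written: `TinyOS`, `Process`, `make_present_pte`, `make_swap_pte` and the bit layout (bit 0 as the valid bit, the remaining bits holding a swap slot number) are all hypothetical names and choices made for illustration. The point it tries to show is that the fault handler always looks in the page table of the process that is currently running, so the same virtual address 0x400000 leads to a different swap slot for each process.

```python
PAGE_SHIFT = 12        # 4 KB pages
PTE_PRESENT = 0x1      # assumed layout: bit 0 is the "valid" bit


def make_present_pte(frame):
    """Entry the MMU would use: frame number plus the valid bit."""
    return (frame << PAGE_SHIFT) | PTE_PRESENT


def make_swap_pte(slot):
    """Valid bit clear, so the other bits are free to hold a swap slot."""
    return slot << 1


class Process:
    def __init__(self, pid):
        self.pid = pid
        self.page_table = {}   # virtual page number -> PTE (int)


class TinyOS:
    """Toy OS with exactly one physical frame and a list acting as swap."""

    def __init__(self):
        self.frame_data = None    # contents of the single physical frame
        self.frame_owner = None   # (process, vpn) currently mapped to it
        self.swap = []            # swap "disk", indexed by slot number
        self.current = None

    def switch_to(self, proc):
        # Stands in for pointing the CPU at this process's page tables.
        self.current = proc

    def _evict(self):
        if self.frame_owner is None:
            return
        owner, vpn = self.frame_owner
        self.swap.append(self.frame_data)
        # Record the slot in the *owner's* page table entry.
        owner.page_table[vpn] = make_swap_pte(len(self.swap) - 1)
        self.frame_owner = None

    def access(self, vaddr, load_from_executable):
        proc = self.current
        vpn = vaddr >> PAGE_SHIFT
        pte = proc.page_table.get(vpn)
        if pte is not None and pte & PTE_PRESENT:
            return self.frame_data            # no fault
        # Page fault: free the frame, then fill it for the current process.
        self._evict()
        if pte is None:
            data = load_from_executable(proc.pid, vaddr)   # first touch
        else:
            data = self.swap[pte >> 1]        # slot stored in the entry
        self.frame_data = data
        self.frame_owner = (proc, vpn)
        proc.page_table[vpn] = make_present_pte(0)
        return data


def load_from_executable(pid, vaddr):
    # Hypothetical stand-in for reading the text page from the binary.
    return f"page_pid_{pid}_{vaddr:#x}"


tiny_os = TinyOS()
procs = [Process(pid) for pid in range(3)]
for p in procs:                     # processes 0, 1, 2 each touch 0x400000
    tiny_os.switch_to(p)
    tiny_os.access(0x400000, load_from_executable)

tiny_os.switch_to(procs[1])         # process 1 is scheduled again
result = tiny_os.access(0x400000, load_from_executable)
print(result)                       # page_pid_1_0x400000
```

The sketch never frees or reuses swap slots and keeps the slot number directly in the page table entry, as in the older design described above. A system that uses an auxiliary structure instead could keep the same fault path and replace the `pte >> 1` lookup with a lookup keyed by (address space, virtual address).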