I'm trying to understand how the OS swaps data between disk and RAM when a page fault occurs. For instance, assume the page table of a process is full and a swap needs to happen.
Does the frame that the page table entry points to get written to disk, and then get overwritten by the newly requested data? Or does the page itself contain the frame data in this case?
Additionally, if two virtual addresses can map to the same physical address (assuming it's not because of shared memory), is it the data in the frame their page points to that gets written to disk?
My confusion comes from the fact that the book I'm reading (and many resources online) mentions that the 'page' gets written to disk, but from what I understand the page only contains metadata about the frame and the address, not the memory data itself. So, how does it work, exactly?
Does a page table entry only contain metadata?
Yes.
In general there are virtual pages (what software sees), physical pages (actual pages of RAM that hardware sees), and metadata (page table entries) that describe the relationship between virtual pages and physical pages.
Assume that there are 90 virtual pages (of data being used by programs) and 100 physical pages (of actual RAM), where:
virtual page #0 = physical page #38
virtual page #1 = physical page #22
virtual page #2 = physical page #41
...
virtual page #89 = physical page #12
Note that this mapping is (a crudely simplified version of) the metadata stored in page table entries. The data itself lives in the physical pages; the page table entries only say where to find it.
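To make the split between metadata and data concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration (the 4096-byte page size, the `PageTableEntry` fields, the `translate` helper); real page table entries are packed bit fields that the CPU walks in hardware. Only the four mappings listed above are modelled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class PageTableEntry:
    """Metadata only: where a virtual page's data currently lives."""
    present: bool
    frame: Optional[int] = None      # physical page number, valid when present
    swap_slot: Optional[int] = None  # location in swap space when not present


# Physical RAM: 100 physical pages that hold the actual bytes.
ram = [bytearray(PAGE_SIZE) for _ in range(100)]

# Page table: only the entries spelled out in the example above.
page_table = {
    0: PageTableEntry(present=True, frame=38),
    1: PageTableEntry(present=True, frame=22),
    2: PageTableEntry(present=True, frame=41),
    89: PageTableEntry(present=True, frame=12),
}


def translate(virtual_address: int) -> Tuple[int, int]:
    """Return (physical page, offset) for a virtual address."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    entry = page_table[page]
    if not entry.present:
        raise LookupError(f"page fault on virtual page #{page}")
    return entry.frame, offset


# Writing through virtual page #1 changes bytes in physical page #22.
# The page table entry itself is untouched.
frame, offset = translate(1 * PAGE_SIZE + 5)
ram[frame][offset] = 0xAB
```

The point of the sketch is that `page_table` never holds user data. When a book says "the page is written to disk", it is the contents of `ram[frame]` that get written.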
Now assume software allocates a new page (so that there would be 91 virtual pages being used by programs), and the OS decides there aren't enough free physical pages of RAM (because the OS needs to ensure there's a little free physical memory for its own use), so it decides to send a page to swap space. That means the contents of the physical page are written to disk and the page table entry is marked "not present". The result might be:
virtual page #0 = physical page #38
virtual page #1 = NOT PRESENT (sent to swap space)
virtual page #2 = physical page #41
...
virtual page #89 = physical page #12
virtual page #90 = physical page #73
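The allocation step above can be sketched roughly as follows. This is an illustrative model, not how any particular kernel does it: `MIN_FREE`, `evict`, and `allocate` are hypothetical names, the victim is passed in by hand (a real OS would pick one with a replacement policy), and only the physical pages named in the example are modelled.

```python
PAGE_SIZE = 4096  # assumed page size in bytes
MIN_FREE = 1      # assumed number of free physical pages the OS wants to keep

# Physical pages hold the actual data. Physical page #73 starts out free.
ram = {f: bytearray(PAGE_SIZE) for f in (38, 22, 41, 12, 73)}
free_frames = [73]
swap = {}         # swap slot -> saved page contents ("disk")

# Page table: virtual page -> ("ram", physical page) or ("swap", slot)
page_table = {0: ("ram", 38), 1: ("ram", 22), 2: ("ram", 41), 89: ("ram", 12)}


def evict(victim: int) -> None:
    """Write the victim's physical page to swap and mark it not present."""
    where, frame = page_table[victim]
    assert where == "ram"
    slot = len(swap)
    swap[slot] = bytes(ram[frame])       # the data in the frame goes to disk
    page_table[victim] = ("swap", slot)  # the entry only records where it went
    free_frames.append(frame)


def allocate(new_page: int, victim: int) -> None:
    """Map a new virtual page, then restore the free reserve if needed."""
    frame = free_frames.pop()
    ram[frame][:] = bytes(PAGE_SIZE)     # hand out a zeroed physical page
    page_table[new_page] = ("ram", frame)
    if len(free_frames) < MIN_FREE:
        evict(victim)


ram[22][:5] = b"hello"            # some data living in virtual page #1
allocate(new_page=90, victim=1)   # reproduces the table shown above
```

After the call, virtual page #90 maps to physical page #73, virtual page #1 is marked as being in swap, and the bytes that used to sit in physical page #22 are in `swap[0]`.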
Now assume a program tries to use data in virtual page #1. The CPU can't figure out where the data is (because the metadata in the page table says the page isn't present), so it raises a page fault to inform the OS. The OS determines what happened (if it's a software bug or ...) and decides to evict a different virtual page: it writes that page's data to swap space, then re-uses the freed physical page by loading the data for virtual page #1 from swap space into it. The result might be:
virtual page #0 = physical page #38
virtual page #1 = physical page #41
virtual page #2 = NOT PRESENT (sent to swap space)
...
virtual page #89 = physical page #12
virtual page #90 = physical page #73
After this, the software that wanted data from virtual page #1 can continue (because the CPU can find the data now that the OS has changed the metadata).
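The page fault sequence can be sketched in the same spirit. The names `handle_page_fault` and `read_byte` are hypothetical, the victim (virtual page #2) is hard-coded to match the example rather than chosen by a replacement policy, and the "software bug" case is reduced to a single check.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

# State before the fault: virtual page #1 is in swap, #2 is in physical page #41.
ram = {f: bytearray(PAGE_SIZE) for f in (38, 41, 12, 73)}
ram[41][:11] = b"page-2 data"
swap = {0: b"page-1 data".ljust(PAGE_SIZE, b"\0")}
page_table = {
    0: ("ram", 38), 1: ("swap", 0), 2: ("ram", 41),
    89: ("ram", 12), 90: ("ram", 73),
}


def handle_page_fault(faulting_page: int, victim_page: int) -> None:
    entry = page_table.get(faulting_page)
    if entry is None or entry[0] != "swap":
        # Nothing to swap in: treat the access as a bug in the program.
        raise RuntimeError(f"invalid access to virtual page #{faulting_page}")
    _, slot = entry
    _, frame = page_table[victim_page]

    # 1. Write the victim's data out and mark the victim not present.
    victim_slot = max(swap) + 1
    swap[victim_slot] = bytes(ram[frame])
    page_table[victim_page] = ("swap", victim_slot)

    # 2. Read the faulting page's data into the freed physical page.
    ram[frame][:] = swap.pop(slot)

    # 3. Update the metadata so the access can be retried.
    page_table[faulting_page] = ("ram", frame)


def read_byte(page: int, offset: int) -> int:
    where, location = page_table[page]
    if where == "swap":  # the CPU would raise a page fault here
        handle_page_fault(page, victim_page=2)
        where, location = page_table[page]
    return ram[location][offset]


first = read_byte(1, 0)  # faults once, then succeeds
```

In this sketch, both the data written out and the data read back in are full physical page contents. The page table entries are only updated to reflect where that data ended up.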