Search code examples
operating-systemhashtablepagingpage-tables

Paging: Basic, Hierarchical, Hashed, and Inverted


With respect to operating systems and page tables, it seems there are 4 general methods to paging and page tables

Basic - A single page table which stores the page number and the offset

Hierarchical - A multi-tiered table which breaks up the virtual address into multiple parts

Hashed - A hashed page table which may often include multiple hashings mapping to the same entry

Inverted - The logical address also includes the PID, page number and offset. Then the PID is used to find the page in to the table and the number of rows down the table is added to the offset to find the physical address for main memory. (Rough, and probably terrible definition)

I am just wondering what are the pros and cons of each method? It seems like basic is the easier method but may also take up more space in memory for a larger address space. What else?


Solution

  • The key to building a usable page model is minimizing the unused space for entries that are not necessary. You want to minimize the amount of memory needed while keeping the computation cost of a memory lookup low.

    • Basic can take up a lot of memory (for a modern system using 4GB of memory, that might amount to 300 MB only for the table) and is therefore impractical.

    • Hierarchical reduces that memory a lot by only adding subtables that are actually in use. Still, every process has a root page table. And if the memory footprint of the processes is scattered, there may still be a lot of unnecessary entries in secondary tables. This is a far better solution regarding memory than Basic and introduces only a marginal computation increase.

    • Hashed does not work because of hash collisions

    • Inverted is the solution to make Hashed work. The memory use is very small (as big as a Basic table for a single process, plus some PID and chaining overhead). The problem is, if there is a hash collision (several processes use the same virtual address) you will have to follow the chain information (just as in a linked list) until you find the entry with a matching PID. This may produce a lot of computing overhead in addition to the hash computing, but will keep the memory footprint as small as possible.