Search code examples
linuxkernelcpu-architecturemmu

Linux Page Table Management and MMU


I have a question about relationship between linux kernel and MMU. I now got a point that the linux kernel manages page table between virtual memory addresses and physical memory addresses. At the same time there is MMU in x86 architecture which manages page table between virtual memory addresses and physical memory addresses. If MMU presents near CPU, does kernel still need to take care of page table?

This question may be stupid, but the other question is, if MMU takes care of memory space, who manages high memory and low memory? I believe kernel will receive size of virtual memory from MMU (4GB in 32bit) then kernel will distinguish between userspace and kernel space in virtual address. Am I correct? or completely wrong?

Thanks a lot in advance!


Solution

  • The OS and MMU page management responsibilities are 2 sides of the same mechanism, that lives on the boundary between architecture and micro-architecture.

    The first side defines the "contract" between the hardware and the software that runs over it (in this case - the OS) - if you want to use virtual memory, you need build and maintain a page table as described in that contract. The MMU side, on the other hand, is a hardware unit that's responsible for performing the HW tasks of the address translation. This may or may not include hardware optimizations, these are usually hidden and may be implemented in various ways to run under the hood, as long as it maintains the hardware side of the contract.

    In theory, the MMU may decide to issue a set of memory accesses for each translation (a page walk), in order to achieve the required behavior. However, since it's a performance critical element, most MMUs optimize this by caching the results of previous page walks inside the TLB, just like a cache stores the results of previous accesses (actually, on some implementations, the caches themselves may also store some of the accesses to the page table since it usually resides in cacheable memory). The MMU can manage multiple TLBs (most implementations separate the ones for data and code pages, and some have 2nd level TLBs), and provide the translation from there without you noticing that except for the faster access time.

    It should also be noted that the hardware must guard against many corner cases that can harm the coherency of such TLB "caching" of previous translations, for example page aliasing or remaps during usage. On some machines, the nastier cases even require a massive flush flow called TLB shootdown.