Tags: kernel, x86-64, paging, osdev, userspace

x86-64 User Mode access to higher half addresses


I've been studying how paging and access controls work on x86-64, and I'm trying to understand how the USER flag in page-table entries interacts with memory access from user-mode processes.

As far as I understand, setting the USER flag in a page table entry allows a page to be accessible from user-mode (ring 3). However, my question is:

Suppose a page in the higher half of the address space (typically used by the kernel) is mapped with the USER flag set at every level of the page-table walk (the PML4, PDPT, PD, and PT entries). Does that mean a user-mode process can access this address? Or is there additional enforcement at the CPU level (e.g., canonical address restrictions) that prevents user-mode code from accessing this region even if it's marked as USER?


Solution

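  • A high-half page with the USER bit set at every level of the page tables is accessible by user-space. The canonical-address check only requires the upper bits to be a sign-extension of bit 47 (or bit 56 with 5-level paging); it doesn't distinguish user from kernel.
    This was commonly done in practice in 32-bit kernels. For example, the usual config for Linux was a 3:1 user:kernel split to allow a single user-space process to have up to 3 GiB of virtual address-space, so user pages between 2 GiB and 3 GiB were in the high half of the 32-bit address space.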

    High-half-kernel is just a software convention, not baked into the ISA.
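To make that concrete, here is a rough sketch of the rule as I understand it, written as a software model rather than real kernel code. The names (`user_may_access`, `is_canonical48`, `is_high_half`) are made up for illustration, it assumes 4-level paging, and it deliberately ignores everything else the MMU checks (write permission, NX, protection keys, and so on). The point is that the high-half test never appears in the permission check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT (1ULL << 0)  /* P bit of a paging-structure entry */
#define PTE_USER    (1ULL << 2)  /* U/S bit: 1 = user-mode access allowed */

/* With 4-level paging, bits 63:47 must all be copies of bit 47. */
bool is_canonical48(uint64_t va)
{
    uint64_t top = va >> 47;          /* the 17 bits 63:47 */
    return top == 0 || top == 0x1FFFF;
}

/* "High half" is only a naming convention for addresses with bit 63 set. */
bool is_high_half(uint64_t va)
{
    return (va >> 63) != 0;
}

/*
 * Simplified model of the user-mode check for one translation.
 * entries[] holds the PML4E, PDPTE, PDE and PTE used for va (hypothetical
 * layout for this sketch). Note that is_high_half() is never consulted.
 */
bool user_may_access(uint64_t va, const uint64_t *entries, size_t levels)
{
    assert(levels > 0);
    if (!is_canonical48(va))
        return false;                 /* non-canonical: faults regardless */
    for (size_t i = 0; i < levels; i++) {
        if (!(entries[i] & PTE_PRESENT))
            return false;             /* not mapped */
        if (!(entries[i] & PTE_USER))
            return false;             /* supervisor-only at this level */
    }
    return true;
}
```

Calling `user_may_access(0xffffffffff600000, entries, 4)` with P and U/S set in all four entries returns true in this model, even though `is_high_half()` is true for that address. Clearing U/S in any one entry makes it return false.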

    x86-64 Linux maps the [vsyscall] page at the same high-half address in every user-space process. This mechanism was obsoleted by the vDSO, which uses pages at a randomized address near the top of the low half (0x00007ff...), allowing processes to run calls like clock_gettime and gettimeofday purely in user-space without a slow syscall instruction to enter the kernel. On my desktop, cat /proc/self/maps shows the [vdso] pages near the top of the low half, e.g. 7ffc149f4000 in one run, but the [vsyscall] page at ffffffffff600000-ffffffffff601000. (What are vdso and vsyscall?). Apparently the [vsyscall] page still works for backwards compatibility, so we have a real-world example of user-space using a high-half address on x86-64, although recent kernels usually trap and emulate calls into it rather than executing the page natively (see the vsyscall= boot option).
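If you want to check this on your own machine without eyeballing the cat output, something like the following should work. It's a minimal sketch, assuming Linux and the usual `start-end perms ...` layout of /proc/self/maps. The helper names (`parse_maps_range`, `print_special_mappings`) are my own, and a [vsyscall] line may simply be absent depending on kernel configuration.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse the leading "start-end" hex range of one /proc/<pid>/maps line. */
bool parse_maps_range(const char *line, uint64_t *start, uint64_t *end)
{
    return sscanf(line, "%" SCNx64 "-%" SCNx64, start, end) == 2;
}

/*
 * Print the [vdso] and [vsyscall] lines of this process, labelled by
 * which half of the address space they start in.
 * Returns the number of lines printed, or -1 if the file can't be opened.
 */
int print_special_mappings(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    char line[512];
    int found = 0;
    while (fgets(line, sizeof line, f)) {
        if (!strstr(line, "[vdso]") && !strstr(line, "[vsyscall]"))
            continue;
        uint64_t start, end;
        if (!parse_maps_range(line, &start, &end))
            continue;
        /* Bit 63 set means the mapping starts in the high half. */
        printf("%s half: %s", (start >> 63) ? "high" : "low", line);
        found++;
    }
    fclose(f);
    return found;
}
```

Calling `print_special_mappings()` from `main` prints one labelled line per mapping found. With the addresses quoted above, [vdso] would be labelled low half and [vsyscall] high half.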


    The one case I know of where x86-64 assumes MSB-set = kernel is in Intel's LAM (Linear Address Masking), an optional feature that can speed up tagged pointers by making the hardware ignore some of the high bits. It's enableable for user-space addresses, kernel addresses, or both, with 48-bit or 57-bit address widths. The top bit has to match bit #47 or bit #56 respectively, leaving 15 or 6 bits for other uses. (Using the extra 16 bits in 64-bit pointers / https://lwn.net/Articles/902094/).
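Here's how I'd model that description in software. This is only a sketch of the rule as stated above (the top bit must match bit 47 or bit 56, and the bits in between are ignored), not a statement of everything the hardware checks. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of ignored tag bits: bits 62:width. 15 for 48, 6 for 57. */
unsigned lam_tag_bits(unsigned width)
{
    assert(width == 48 || width == 57);
    return 63 - width;
}

/* The rule from the text: bit 63 must match bit 47 (or bit 56). */
bool lam_check(uint64_t ptr, unsigned width)
{
    assert(width == 48 || width == 57);
    return (ptr >> 63) == ((ptr >> (width - 1)) & 1);
}

/*
 * Recover the plain canonical address by sign-extending from the top
 * translated bit, which discards whatever was stored in the tag bits.
 */
uint64_t lam_strip(uint64_t ptr, unsigned width)
{
    assert(width == 48 || width == 57);
    uint64_t low_mask = (1ULL << width) - 1;
    uint64_t low = ptr & low_mask;
    bool negative = (ptr >> (width - 1)) & 1;
    return negative ? (low | ~low_mask) : low;
}
```

For example, putting a 6-bit tag into bits 62:57 of the user address 0x00007ffc149f4000 keeps bit 63 clear, so `lam_check(tagged, 57)` passes and `lam_strip(tagged, 57)` gives back the original address. Bit 63 is what tells the hardware whether the user or the supervisor setting applies, which is the MSB-set = kernel assumption mentioned above.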

    AMD's UAI (Upper Address Ignore) ignores the top 7 bits; I haven't looked into it as much so IDK if it has kernel vs. user differentiation in hardware based on sign-extending the low 57 bits.

    ARM TBI (Top Byte Ignore) is also similar, but again IDK if it has user vs. kernel classification of addresses according to the high bit. (Or the highest bit that isn't ignored.)