linux-kernel x86-64 cpu-architecture system-calls osdev

Calling system API from 32-bit processes under Linux 64-bit


64-bit Linux on x86-64 uses four segment descriptors, covering the code and data segments for user space and kernel space.

AFAIK, a 64-bit process calls the system API by executing the syscall instruction from the ring 3 (user-mode) code segment. The kernel then executes sysret to return control to the calling 64-bit process, switching the CS segment selector back to the 64-bit user-mode code segment.
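
One way to observe the selector described above is to read CS from user space. This is a minimal sketch, assuming GCC or Clang inline assembly on x86; `read_cs` and `selector_rpl` are hypothetical helper names. On x86-64 Linux a 64-bit process normally sees 0x33 and a 32-bit (compat-mode) process 0x23, but those values come from the kernel's GDT layout and are not fixed by the architecture.

```c
/* Read the current code-segment selector.
 * Returns -1 when not compiled for x86 (assumption: GCC/Clang inline asm). */
static int read_cs(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned short cs;
    __asm__ ("mov %%cs, %0" : "=r"(cs));
    return cs;
#else
    return -1;
#endif
}

/* The low two bits of a selector are the requested privilege level (RPL);
 * user-mode code runs with RPL 3. */
static int selector_rpl(int selector)
{
    return selector & 3;
}
```

Building the same file with and without `-m32` and printing `read_cs()` should show the two different user-mode code segments.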

What about 32-bit processes? According to the Intel SDM, syscall is not supported in IA-32e compatibility mode.

P.S. I'm aware that on 64-bit Windows, 32-bit applications are supported via the WoW64 subsystem, and a call into the system API is made by switching the logical CPU/core from 32-bit compatibility mode to 64-bit mode in user space. The system call is actually made from 64-bit mode, and on return the logical CPU is switched back to compatibility mode.


Solution

  • See https://blog.packagecloud.io/the-definitive-guide-to-linux-system-calls/.

    The recommended way to make system calls from 32-bit code in Linux is to call into the VDSO, a "library" of code+data that the kernel maps into the address-space of every executable. The kernel chooses at bootup which instructions to put into it, depending on what the CPU supports.
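
To make the "mapped into every address space" point concrete, here is a small sketch that locates the vDSO from inside a process. It assumes glibc (or musl) `getauxval`, which reads the auxiliary vector the kernel passes at exec time; `vdso_base` and `vdso_looks_like_elf` are hypothetical helper names. On 32-bit x86 the auxiliary vector also carries `AT_SYSINFO`, the address of the vDSO's system-call entry point.

```c
#include <elf.h>       /* AT_SYSINFO_EHDR, ELFMAG, SELFMAG */
#include <string.h>    /* memcmp */
#include <sys/auxv.h>  /* getauxval */

/* Base address of the vDSO image, or NULL if the kernel did not map one. */
static const unsigned char *vdso_base(void)
{
    return (const unsigned char *)getauxval(AT_SYSINFO_EHDR);
}

/* Return 1 if the vDSO is present and starts with the ELF magic bytes. */
static int vdso_looks_like_elf(void)
{
    const unsigned char *base = vdso_base();
    return base != NULL && memcmp(base, ELFMAG, SELFMAG) == 0;
}
```

The image at that address is an ordinary ELF shared object, which is why the dynamic linker can resolve symbols in it like any other library.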

    64-bit kernel, 64-bit user-space (64-bit mode)

    • Glibc uses syscall directly, only calling a VDSO wrapper for system calls like clock_gettime and gettimeofday that can run purely in user-space. All x86-64 CPUs support syscall from 64-bit user-space.
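
As an illustration of the 64-bit path, the following sketch issues a zero-argument system call with the raw syscall instruction. It assumes GCC or Clang inline assembly and the x86-64 Linux convention (call number and result in RAX, with RCX and R11 clobbered by the instruction); `raw_syscall0` is a hypothetical name, and this is not how glibc's own wrappers are written.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>  /* SYS_getpid */
#include <unistd.h>       /* syscall(), getpid() */

/* Invoke a system call that takes no arguments. */
static long raw_syscall0(long nr)
{
#if defined(__x86_64__)
    long ret;
    /* syscall saves the return RIP in RCX and RFLAGS in R11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr)
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Not x86-64: fall back to the libc wrapper. */
    return syscall(nr);
#endif
}
```

Calling `raw_syscall0(SYS_getpid)` should return the same value as `getpid()`.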

    64-bit kernel, 32-bit user-space (compat mode, a sub-mode of long mode)

    • On Intel CPUs, x86-64 Linux's 32-bit VDSO system-call wrapper uses sysenter.
      AMD only supports sysenter in legacy mode, if at all.
    • On AMD CPUs, x86-64 Linux's 32-bit VDSO system-call wrapper uses syscall. The 64-bit kernel side has similar semantics between 64-bit and compat mode user-space. Intel CPUs only support syscall in full 64-bit mode.
    • I think all x86-64 CPUs support one or the other, so the fallback to slow int 0x80 is never needed. I don't know which one is supported by CPUs from Via or Zhaoxin or other vendors.
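
The point of going through the VDSO is that 32-bit code does not need to know which instruction the kernel picked. A rough sketch of that, assuming GCC or Clang inline assembly and the i386 convention (call number and result in EAX); `vdso_syscall0` is a hypothetical name. The VDSO path is only compiled when building 32-bit x86 code (for example with `-m32`); otherwise the sketch falls back to the libc wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <elf.h>          /* AT_SYSINFO */
#include <sys/auxv.h>     /* getauxval */
#include <sys/syscall.h>  /* SYS_getpid */
#include <unistd.h>       /* syscall(), getpid() */

/* Zero-argument system call through the VDSO entry point, which hides
 * the sysenter/syscall choice from the caller. */
static long vdso_syscall0(long nr)
{
#if defined(__i386__)
    long ret;
    void *entry = (void *)getauxval(AT_SYSINFO);  /* __kernel_vsyscall */
    if (entry != NULL) {
        __asm__ volatile ("call *%[entry]"
                          : "=a"(ret)
                          : "a"(nr), [entry] "r"(entry)
                          : "memory", "cc");
    } else {
        /* No VDSO entry point: slow software-interrupt path. */
        __asm__ volatile ("int $0x80" : "=a"(ret) : "a"(nr) : "memory");
    }
    return ret;
#else
    /* Not built as 32-bit x86: use the libc wrapper instead. */
    return syscall(nr);
#endif
}
```

Built with `-m32` and run on a 64-bit kernel, the indirect call lands in whichever compat-mode wrapper the kernel selected for the CPU.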

    32-bit kernels (legacy mode)

    • (Intel and AMD): The VDSO uses sysenter if available. Intel CPUs were the first to support this. AMD added support for sysenter (in legacy mode only, not compat mode) to their CPUs some time after adding legacy-mode syscall.
    • Otherwise it uses int 0x80. Only ancient CPUs from either vendor are stuck with this.

    AMD CPUs support syscall in legacy mode with different semantics from long mode, but Linux doesn't use that even if available. According to kernel comments in entry_64_compat.S, which defines the entry points from compat mode into a 64-bit kernel, Linux disables the SYSCALL instruction on 32-bit kernels because SYSCALL in legacy/native 32-bit mode (as opposed to compat mode) is sufficiently poorly designed as to be essentially unusable. That remark is part of a long comment on the compat-mode syscall entry point, which Linux does use on AMD CPUs.

    Intel CPUs support sysenter from 64-bit user-space, but Linux never uses that, only syscall. (Intel manual: https://www.felixcloutier.com/x86/sysenter)
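
The cases above can be summarized as a small lookup. This is only a sketch of the rules as described in this answer, not kernel code: `entry_instruction` and `enum vendor` are hypothetical names, only Intel and AMD are modelled, and `has_sysenter` is consulted only for 32-bit kernels.

```c
#include <assert.h>

enum vendor { VENDOR_INTEL, VENDOR_AMD };

/* Which instruction Linux user space ends up using to enter the kernel. */
static const char *entry_instruction(int kernel_bits, int user_bits,
                                     enum vendor v, int has_sysenter)
{
    assert(kernel_bits == 32 || kernel_bits == 64);
    /* 64-bit user space requires a 64-bit kernel. */
    assert(user_bits == 32 || (user_bits == 64 && kernel_bits == 64));

    if (kernel_bits == 64 && user_bits == 64)
        return "syscall";                 /* supported by all x86-64 CPUs */
    if (kernel_bits == 64)                /* compat mode */
        return v == VENDOR_INTEL ? "sysenter" : "syscall";
    /* 32-bit kernel, legacy mode: legacy-mode syscall is never used. */
    return has_sysenter ? "sysenter" : "int 0x80";
}
```

For example, `entry_instruction(64, 32, VENDOR_AMD, 1)` yields `"syscall"`, while the same call with `VENDOR_INTEL` yields `"sysenter"`.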

    The history here is that before AMD64 existed, both vendors added their own fast-system-call instructions as extensions. AMD's was apparently not well designed, which is why it has different kernel-side semantics in 64-bit mode.

    For x86-64, each vendor kept more support for its own fast-system-call instruction across modes, although AMD later added support for sysenter in legacy mode because at least some OSes (like Linux) weren't using syscall in 32-bit kernels.


    Related

    I didn't think any of those explained the differences between compat mode and legacy mode clearly enough, so they are not duplicates.