process  linux-kernel  operating-system  system-calls  interrupt

Does a system call involve a context switch or not?


I am reading the Wikipedia page on system calls and I cannot reconcile a few of the statements made there.

At the bottom, it says that "A system call does not generally require a context switch to another process; instead, it is executed in the context of whichever process invoked it."

Yet, at the top, it says that "[...] applications to request services via system calls, which are often initiated via interrupts. An interrupt [...] passes control to the kernel [and then] the kernel executes a specific set of instructions over which the calling program has no direct control".

It seems to me that if the interrupt "passes control to the kernel," then the kernel, which is "another process," is executing, and therefore a context switch has happened. So there seems to be a contradiction in the Wikipedia page. Where is my understanding wrong?


Solution

  • Your understanding is wrong because the kernel isn't a separate process. The kernel's code and data are mapped into the address space of every process, typically in the top half of the virtual address space, so kernel code runs in the context of whichever process entered it.

    A system call does not necessarily use an interrupt. On x86-64, the kernel is entered directly with a dedicated processor instruction (syscall). This instruction switches the processor to kernel mode and jumps to the address stored in a model-specific register (IA32_LSTAR), which the kernel sets at boot to point to its system call entry routine.
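    To make the instruction concrete, here is a minimal sketch that issues getpid with the syscall instruction directly instead of going through the libc wrapper. It assumes Linux on x86-64 with GCC or Clang inline assembly; raw_getpid is a hypothetical helper name, and on other architectures it simply falls back to libc's syscall() function.

    ```c
    #define _GNU_SOURCE
    #include <assert.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    /* Hypothetical helper: ask the kernel for our PID without the libc wrapper. */
    long raw_getpid(void)
    {
    #if defined(__x86_64__) && defined(__linux__)
        long ret;
        /* rax carries the system call number in and the result out.
           The syscall instruction itself overwrites rcx and r11. */
        __asm__ volatile ("syscall"
                          : "=a"(ret)
                          : "a"((long)SYS_getpid)
                          : "rcx", "r11", "memory");
        return ret;
    #else
        /* Assumption: on other architectures, let libc pick the entry instruction. */
        return syscall(SYS_getpid);
    #endif
    }

    int main(void)
    {
        long raw = raw_getpid();
        long libc = (long)getpid();

        /* Both paths are answered for the calling process, so they must agree. */
        assert(raw == libc);
        printf("raw syscall: %ld, libc getpid: %ld\n", raw, libc);
        return 0;
    }
    ```

    The kernel answers with the PID of the caller because it is running on behalf of that same process; no other process is involved.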

    Syscalls don't necessarily involve a full context switch. They do involve a switch from user mode to kernel mode. Most kernels keep a kernel stack per process (or per thread). This stack is essentially empty while no system call is active, since there is nothing to store in it then.

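    The registers also need to be saved, since the kernel will use them. I don't know about other processors, but x86-64 has the TSS, which lets the processor switch from the user stack to the kernel stack automatically when the kernel is entered through an interrupt or exception. The syscall instruction does not switch stacks by itself, so on that path the kernel's entry code loads the kernel stack pointer. In both cases the general-purpose registers still need to be saved manually.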

    In the end, entering the kernel through a system call does require a partial context switch, but it doesn't involve switching to another process. Since the storage for the saved registers and the kernel stack is already reserved, it involves much less overhead than a process switch, because the kernel doesn't need to switch to another process's page tables. Switching page tables typically invalidates TLB entries (cached address translations), which then have to be refilled.
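    One rough way to observe this from user space is to compare the process's context switch counters before and after a burst of system calls. The sketch below assumes Linux (or another system where getrusage fills in ru_nvcsw); voluntary_switches_during is a hypothetical helper name, and getppid is used only because it is a cheap call that does not block.

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/resource.h>

    /* Hypothetical helper: voluntary context switches recorded for this
       process while it performs n non-blocking system calls. */
    long voluntary_switches_during(long n)
    {
        struct rusage before, after;

        if (getrusage(RUSAGE_SELF, &before) != 0)
            return -1;
        for (long i = 0; i < n; i++)
            (void)getppid();          /* enters and leaves the kernel each time */
        if (getrusage(RUSAGE_SELF, &after) != 0)
            return -1;

        return after.ru_nvcsw - before.ru_nvcsw;
    }

    int main(void)
    {
        long n = 100000;
        long delta = voluntary_switches_during(n);

        assert(delta >= 0);           /* the counter never decreases */
        printf("%ld system calls, %ld voluntary context switches\n", n, delta);
        return 0;
    }
    ```

    If every system call were a context switch to "the kernel process," the counter would grow by roughly the number of calls. One would instead expect it to stay far below that, since a process is only switched out when it blocks or is preempted.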