Why does Linux 5.15 arm64 `cpu_switch_to` set `sp_el0` to the next task_struct base?

I was reading the code below, from arch/arm64/kernel/entry.S in Linux 5.15:

/*
 * Register switch for AArch64. The callee-saved registers need to be saved
 * and restored. On entry:
 *   x0 = previous task_struct (must be preserved across the switch)
 *   x1 = next task_struct
 * Previous and next are guaranteed not to be the same.
 *
 */
SYM_FUNC_START(cpu_switch_to)
    mov x10, #THREAD_CPU_CONTEXT // x10 = offsetof(struct task_struct, thread.cpu_context)
    add x8, x0, x10 // x8 = previous cpu_context address
    mov x9, sp 
    stp x19, x20, [x8], #16     // store callee-saved registers
    stp x21, x22, [x8], #16     
    stp x23, x24, [x8], #16
    stp x25, x26, [x8], #16
    stp x27, x28, [x8], #16
    stp x29, x9, [x8], #16  
    str lr, [x8]
    add x8, x1, x10 
    ldp x19, x20, [x8], #16     // restore callee-saved registers
    ldp x21, x22, [x8], #16
    ldp x23, x24, [x8], #16
    ldp x25, x26, [x8], #16
    ldp x27, x28, [x8], #16
    ldp x29, x9, [x8], #16 
    ldr lr, [x8] 
    mov sp, x9                     // <============================= confused
    msr sp_el0, x1                 // <============================= confused
    ptrauth_keys_install_kernel x1, x8, x9, x10
    scs_save x0 // save the scs_sp
    scs_load_current
    ret
SYM_FUNC_END(cpu_switch_to)
NOKPROBE(cpu_switch_to)

I am confused by the assignments to sp and sp_el0 (I marked them in the code). As I understand it, sp refers to sp_el1 in kernel mode, so the code assigns a kernel stack pointer to sp_el1 and a task_struct base address to sp_el0? I would expect that on a process switch sp_el0 is restored to the correct user-space stack top, not to a struct base pointer.

I know that by design sp_el0 holds the base address of the current task_struct, but I thought that only applied in kernel mode. After cpu_switch_to, the CPU will soon return to user mode, so sp_el0 needs to be restored to the correct user stack top.

Solution

cpu_switch_to is a kernel function, so every register it saves and restores holds kernel state, not user state.

The idea with the function is that the processor begins executing the function while 'wearing' the context of a specific thread, but then returns 'wearing' the context of a different thread.

Every thread that is "waiting" has a kernel stack whose most recent frame is inside this function, at the point where that thread last stopped executing.
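To make the "enter as one thread, return as another" idea concrete, here is a minimal user-space sketch in C. It is only a model, not the kernel's implementation: `struct task`, `struct cpu` and `model_switch_to` are hypothetical names invented for illustration, and the CPU is represented as a plain struct rather than real registers.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of the per-task saved kernel context. */
struct cpu_context {
    uint64_t x19_x28[10];   /* callee-saved x19..x28 */
    uint64_t fp;            /* x29 */
    uint64_t sp;            /* kernel stack pointer at the switch */
    uint64_t pc;            /* lr at the switch, i.e. where to resume */
};

/* Hypothetical stand-in for struct task_struct. */
struct task {
    struct cpu_context ctx;
};

/* Hypothetical model of the registers of one CPU. */
struct cpu {
    uint64_t x19_x28[10];
    uint64_t fp, sp, lr;
    uint64_t sp_el0;        /* holds the current task pointer in the kernel */
};

/* Model of cpu_switch_to: save prev's kernel context, load next's. */
static void model_switch_to(struct cpu *cpu, struct task *prev,
                            struct task *next)
{
    assert(prev != next);   /* the kernel guarantees this on entry */

    /* Save the outgoing task's callee-saved state (the stp/str part). */
    for (int i = 0; i < 10; i++)
        prev->ctx.x19_x28[i] = cpu->x19_x28[i];
    prev->ctx.fp = cpu->fp;
    prev->ctx.sp = cpu->sp;
    prev->ctx.pc = cpu->lr;

    /* Load the incoming task's state (the ldp/ldr part). */
    for (int i = 0; i < 10; i++)
        cpu->x19_x28[i] = next->ctx.x19_x28[i];
    cpu->fp = next->ctx.fp;
    cpu->sp = next->ctx.sp;                  /* mov sp, x9 */
    cpu->lr = next->ctx.pc;
    cpu->sp_el0 = (uint64_t)(uintptr_t)next; /* msr sp_el0, x1 */
}
```

After `model_switch_to` returns, the modelled CPU holds the kernel stack pointer and return address that the next task saved when it was last switched out, which is why the real function "returns" into a different thread.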

Regarding your first 'confused' line:

you can see that

mov sp, x9                     // <============================= confused

simply loads x9 into sp - which came from

ldp x29, x9, [x8], #16

which in turn is the value that was saved the last time that task called cpu_switch_to, by

mov x9, sp 
// ...
stp x29, x9, [x8], #16

Nothing special happens to sp here: it is saved and restored like the other callee-saved registers, so each task resumes on its own kernel stack.

Regarding your second 'confused' line:

current points to your currently executing struct task_struct.

Under kernel 5.15, it is implemented in the following way:

static __always_inline struct task_struct *get_current(void)
{
    unsigned long sp_el0;

    asm ("mrs %0, sp_el0" : "=r" (sp_el0));

    return (struct task_struct *)sp_el0;
}

#define current get_current()

As you can see, sp_el0 must point to the current struct task_struct after a context switch; otherwise current would return the wrong task.
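As for the user stack pointer: as far as I understand the arm64 entry code, it is not lost. On entry from user space the kernel saves the user value of sp_el0 into the task's pt_regs and then loads the task pointer into sp_el0; on return to user space it writes the saved value back. The sketch below models that in plain C. All names in it (`struct cpu_model`, `model_kernel_entry`, and so on) are hypothetical and chosen only for illustration; the real logic lives in the kernel's assembly entry and exit paths.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: only the saved user stack pointer slot of pt_regs. */
struct pt_regs_model {
    uint64_t user_sp;
};

/* Hypothetical stand-in for struct task_struct. */
struct task_model {
    struct pt_regs_model regs;
};

/* Hypothetical model of the one register we care about. */
struct cpu_model {
    uint64_t sp_el0;
};

/* Entry from user space: stash the user sp, then make sp_el0 = current. */
static void model_kernel_entry(struct cpu_model *cpu, struct task_model *tsk)
{
    tsk->regs.user_sp = cpu->sp_el0;
    cpu->sp_el0 = (uint64_t)(uintptr_t)tsk;
}

/* Same idea as get_current(): read the task pointer back from sp_el0. */
static struct task_model *model_get_current(const struct cpu_model *cpu)
{
    return (struct task_model *)(uintptr_t)cpu->sp_el0;
}

/* The part of cpu_switch_to that matters here: msr sp_el0, x1. */
static void model_context_switch(struct cpu_model *cpu,
                                 struct task_model *next)
{
    cpu->sp_el0 = (uint64_t)(uintptr_t)next;
}

/* Return to user space: restore the current task's saved user sp. */
static void model_kernel_exit(struct cpu_model *cpu)
{
    struct task_model *tsk = model_get_current(cpu);
    cpu->sp_el0 = tsk->regs.user_sp;
}
```

In this model, a task that is switched in while in the kernel gets its own user stack pointer back on exit, because the exit path finds the right pt_regs through the task pointer that the context switch placed in sp_el0.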