debugging linux-kernel arm arm64

(arm64) How can I use printk to find where schedule() was called from while debugging the kernel? The "lr" value read from the stack looks wrong


I'm trying to boot Linux on an experimental board using busybox and a vanilla Linux kernel (5.10.0).
At the last stage of init_kernel, the kernel executes the init script, whose last command is exec /bin/sh. However, /bin/sh freezes, and I can see that this happens right after schedule() is called and returns. I don't know where execution goes after schedule() returns, and I cannot use gdb on this board. So, as shown below, I added some prints to schedule() to read the lr (link register, x30) value that should remain at the bottom of the stack for the duration of the function.
(On arm64, at function entry, x29 (fp) and x30 (lr) are stored at the bottom of the stack frame. lr holds the address to return to when the function finishes; see "understanding aarch64 assembly function call, how is stack operated?".
The variable passed_it limits the prints to the period after the init script is invoked.)

-- kernel/sched/core.c --

extern int passed_it;
asmlinkage __visible void __sched schedule(void)
{
    struct task_struct *tsk = current;
    register void *sp asm ("sp");

    if (passed_it) printk("@entered schedule\n");
    sched_submit_work(tsk);
    do {
        preempt_disable();
        if (passed_it) printk("@entering __schedule\n");
        __schedule(false);
        if (passed_it) printk("@exited __schedule\n");
        sched_preempt_enable_no_resched();
    } while (need_resched());
    sched_update_worker(tsk);
    if (passed_it) printk("@exiting schedule. sp=%px, fp=%lx, lr=%lx\n",sp,*((long *)sp), *((long *)sp+8));
}

This is the log from the experiment, up to the point where it stopped:

This boot took 0.00 seconds

### calling /bin/sh ###
/bin/sh: can't a@entered schedule
@entering __schedule
@exited __schedule
@exiting schedule. sp=ffffffc0106a3f00, fp=ffffffc0106a3f40, lr=ffffffc0106a3f50

To see what the lr value points to, I generated vmlinux.objdump with aarch64-none-elf-objdump -S vmlinux > vmlinux.objdump. But the address is not in any text section, and System.map shows that 0xffffffc0106a3f50 lies between __start_init_task (0xffffffc0106a0000, the same address as init_stack) and __end_init_task (0xffffffc0106a4000), that is, inside the init task's stack region.

-- System.map --

ffffffc01063c1c0 d cfd_data
ffffffc01063c200 d csd_data
ffffffc01063c220 D __per_cpu_end
ffffffc0106a0000 D __init_end
ffffffc0106a0000 D __initdata_end
ffffffc0106a0000 D __start_init_task
ffffffc0106a0000 D _data
ffffffc0106a0000 D _sdata
ffffffc0106a0000 D init_stack
ffffffc0106a0000 D init_thread_union
ffffffc0106a4000 D __end_init_task
ffffffc0106a4000 D __nosave_begin
ffffffc0106a4000 D __nosave_end
ffffffc0106a4000 d vdso_data_store
ffffffc0106a5000 D boot_args

So why does it point into the init task's stack? Could anyone tell me what I am missing here? On second thought, could the virtual address printed in the experiment be a user-space virtual address rather than a kernel address?


Solution

  • The other day I found out why it was not working, but forgot to update this post.
    The reason was simple. My understanding that the lr (x30) value is stored in the second 8-byte slot of the current stack frame was correct; the problem was a simple pointer-arithmetic mistake.
    The printk statement

    if (passed_it) printk("@exiting schedule. sp=%px, fp=%lx, lr=%lx\n",sp,*((long *)sp), *((long *)sp+8));
    

    should have been

    if (passed_it) printk("@exiting schedule. sp=%px, fp=%lx, lr=%lx\n",sp,*((long *)sp), *((long *)sp+1));
    

    or

    if (passed_it) printk("@exiting schedule. sp=%px, fp=%lx, lr=%lx\n",sp,*((long *)sp), *((long *)(sp+8)));
    

    Adding 1 to a (long *) advances the address by 8 bytes on arm64, so the original (long *)sp+8 read the value 64 bytes above sp instead of 8 bytes above it (very basic).
    After fixing this, I could follow where the program went after schedule() returned: it went back to schedule_preempt_disabled() and then back to rest_init(). So this path was the init task, which ends up idling.