I am programming a hobby OS for ARM (targeting the Raspberry Pi 3B+) in AArch64 mode.
Early in the boot process, I set up the stack pointer in preparation for jumping to the kernel_main function. Execution seems to go fine, but when debugging with GDB under QEMU, the debugger skips a whole bunch of assembly code as soon as I load an address into the sp register.
I can see that the skipped code was executed successfully; the issue is just that the debugger skips over it, and it's driving me crazy since I need to debug some stuff in that area.
Sample code is below. The debugger goes line by line until the second instruction after the set_stack label. As soon as mov sp, x1 is executed, the debugger skips through to the kernel_main function, on which I have a breakpoint.
_start:
    // read cpu id, stop slave cores
    mrs     x1, mpidr_el1
    and     x1, x1, #3
    cbz     x1, 2f
    // cpu id > 0, stop
1:  wfe
    b       1b
2:  // cpu id == 0

set_stack:
    // set top of stack just before our code (stack grows to a lower address per AAPCS64)
    ldr     x1, =_start     // _start is set at address 0x80000 by the linker
    mov     sp, x1          // <-- Problematic code here. This loads address 0x80000 into SP

    // clear bss
    ldr     x1, =__bss_start
    ldr     w2, =__bss_size
3:  cbz     w2, 4f
    str     xzr, [x1], #8
    sub     w2, w2, #1
    cbnz    w2, 3b

    // jump to C code, should not return
4:  bl      kernel_main
    // for failsafe, halt this core too
    b       1b

void kernel_main(uint64_t dtb_ptr32, uint64_t x1, uint64_t x2, uint64_t x3)
{
    uart_init(3);
    uart_puts("Hello, kernel World!\r\n");

    // echo everything back
    while (1)
        uart_putc(uart_getc());
}

Code is loosely taken from https://github.com/bztsrc/raspi3-tutorial.
What's interesting is that if I use the GDB command set $sp=0x80000 to write the same address into the SP register, the mov sp instruction is stepped through correctly by GDB and I can continue with the following instructions as expected.
Setup:
aarch64-elf-gcc
aarch64-elf-as
qemu-system-aarch64 -M raspi3b -kernel $(OUTDIR)myos.img -nographic -s -S -d int
My current workaround is to have a label right after the instruction and have it trigger a breakpoint.
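Roughly, the workaround looks like this on the GDB side. This is only a sketch: it assumes I add a label called after_set_sp (a name picked just for illustration) on the line directly after mov sp, x1, and that GDB is already attached to QEMU.

```gdb
# after_set_sp is a hypothetical label placed directly after "mov sp, x1".
# The leading * makes GDB break at the label's exact address.
break *after_set_sp
continue
# Execution should now be stopped on "ldr x1, =__bss_start";
# sp is expected to hold the address loaded from _start (0x80000 here).
info registers sp
```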
I would still like to know if anyone has had a similar experience or might know what is happening. Maybe a bug in QEMU or GDB?
If you are using step or next to move through this assembler code, then you should consider using stepi or nexti instead.
The step and next commands try to be "smarter" about how they operate: they make use of the debug information line table to ensure that GDB steps by complete source lines. These commands also track which frame the command starts in, and use this to try to avoid stopping in a callee frame.
However, this frame detection doesn't always handle unexpected changes to the stack pointer, and it would appear that in this case, GDB is getting very confused about what's going on.
Switching to stepi should solve these problems. This command is much more basic: it does no frame detection or line table analysis; it just asks the inferior to perform a single instruction step.
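As a rough sketch of what that might look like with the setup from the question: QEMU's -s option listens for GDB on TCP port 1234, and I'm assuming the ELF file with symbols is called kernel.elf (a made-up name, substitute your own).

```gdb
# Load symbols from the ELF (hypothetical file name) and attach to QEMU's gdbstub.
file kernel.elf
target remote localhost:1234
# Break at the exact address of the set_stack label from the question.
break *set_stack
continue
# Print the instruction at $pc after every stop, so each step is visible.
display/i $pc
# Execute one instruction per command: first the ldr, then the mov sp, x1.
stepi
stepi
# Check the new stack pointer; $pc should now be at "ldr x1, =__bss_start".
info registers sp
```

Since stepi ignores the line table and frame information, changing sp between two steps should not affect where it stops.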
The nexti command is a bit of a mix. It does generally just single-step, but it also uses frame analysis to work out when it has entered a callee function. If the stack pointer is adjusted, then I suspect you might still see issues with nexti.
In general, the step and next commands should work just fine; it's only when you start doing things like adjusting the stack pointer or performing a context switch that GDB will get confused and you are better off switching to stepi.