assembly arm64 mach-o

Why is data loaded bigger than expected using ADRP, ADD, and LDR instructions?


I am learning assembly for Apple M1 (arm64) and noticed something odd when loading data from a label in the data segment. The load seems to pick up not just the contents of the label I asked for, var1, but also those of the next label, var2.

.global _start

_start:
    mov x0, #0
    mov x1, #0

    // In this first part we load data from memory.
    // -------------

    // Load var1 from the memory, and set its address to X0 register.
    adrp X0, var1@PAGE      // address of the 4 KB page that contains var1
    add X0, X0, var1@PAGEOFF    // offset to var1 within the page

    // Get the value from the memory address stored in the X0 register.
    // Using `ldr` with brackets loads the actual value,
    // similar to dereferencing a pointer.
    ldr X1, [X0]

    // In this second part we update data from memory.
    // -------------

    // Set 3 to the X2 register.
    mov X2, #3     // <--- This is the line 38 where I add a breakpoint

    // Load the memory address for the var2 data variable.
    adrp X3, var2@PAGE
    add X3, X3, var2@PAGEOFF

    // Store X2 register value to the memory address on the X3 register.
    str X2, [X3]

    // Exit program
    mov X0, 0       // 0 status code
    mov X16, #1
    svc #0x80

.data
var1: .word 5
var2: .word 6

This is how I build it:

as lesson03.s -g -o lesson03.o
ld lesson03.o -o lesson03 -l System -syslibroot `xcrun -sdk macosx --show-sdk-path` -e _start -arch arm64

And this is how I am debugging the value for X1:

lldb lesson03

(lldb) b lesson03.s:38

Breakpoint 1: where = lesson03`start + 20, address = 0x0000000100003f8c
(lldb) run

(lldb) re read x1
      x1 = 0x0000000600000005

I was expecting to get 0x0000000000000005 (decimal 5), but as we can see it was 0x0000000600000005 (decimal 25769803781), which looks like var2 (00000006) and var1 (00000005) loaded together.

Does anyone know why that happens and how to fix it? By the way, I am looking for a solution where I can still load data from the .data segment.


Solution

  • To mark this question as resolved, let me paraphrase Jester and Peter, who answered my question in the comments section.

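    The .word directive emits a 32-bit value, but I was loading it into an x register, which is 64 bits wide. A load into an x register always reads 8 bytes, so it picked up the 4 bytes of var1 plus the 4 bytes of var2 that follow it in memory. Because the load is little-endian, var1 ends up in the lower half of the register and var2 in the upper half. To solve this issue I should either: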

    • Use .quad (64-bit) so that the data and the register are the same size.
    • Or keep .word (32-bit) but load the value into a w register, which is also 32 bits wide. The same applies to the store: use a w register so that only 4 bytes are written.
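
Both options can be sketched as follows. This is a minimal, untested sketch that assumes the same `as`/`ld` build commands as in the question; the label `big1` is hypothetical and added only to illustrate the .quad variant, and the `.p2align` directives are an assumption to keep the code and the 64-bit value naturally aligned.

```asm
.global _start
.p2align 2

_start:
    // Option 1: keep .word (32-bit) data and use w registers.
    adrp x0, var1@PAGE
    add  x0, x0, var1@PAGEOFF
    ldr  w1, [x0]               // reads 4 bytes; writing w1 zeroes the upper half of x1

    mov  w2, #3
    adrp x3, var2@PAGE
    add  x3, x3, var2@PAGEOFF
    str  w2, [x3]               // writes 4 bytes, so nothing past var2 is touched

    // Option 2: declare the data as .quad and keep x registers.
    adrp x4, big1@PAGE          // big1 is a hypothetical label for this sketch
    add  x4, x4, big1@PAGEOFF
    ldr  x5, [x4]               // reads 8 bytes, matching the 8 bytes of big1

    // Exit program
    mov  x0, #0                 // 0 status code
    mov  x16, #1
    svc  #0x80

.data
var1: .word 5
var2: .word 6
.p2align 3
big1: .quad 5
```

With the 32-bit load, `re read x1` after the `ldr` should show only the value of var1 in the low word, because a write to a w register clears the upper 32 bits of the corresponding x register.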
