I have the following assembly code, which I am building for an M1 Mac (arm64 Mach-O):
.text
.globl main
.p2align 2
main:
stp x29, x30, [sp, -16]!
add x29, sp, 0
mov x2, #6
adrp x1, _fmt@PAGE
add x1, x1, _fmt@PAGEOFF
mov x0, #1
adrp x3, _write@PAGE
add x3, x3, _write@PAGEOFF
blr x3
mov w0, #0
ldp x29, x30, [sp], #16
ret
/* end function main */
.data
.balign 8
_fmt:
.ascii "Hello\n"
which fails with:
final section layout:
__TEXT/__text addr=0x100003F88, size=0x00000030, fileOffset=0x00003F88, type=1
__TEXT/__unwind_info addr=0x100003FB8, size=0x00000048, fileOffset=0x00003FB8, type=22
__DATA/__data addr=0x100004000, size=0x00000006, fileOffset=0x00004000, type=0
ld: ARM64 ADRP out of range (-4294979584 max is +/-4GB): from main (0x100003F88) to _write@0x00000000 (0x00000000) in 'main' from test.o for architecture arm64
I believe this happens because the _write instruction references the write syscall, which isn't available until link time, so the assembler doesn't know what to put as the address for _write and it gets written as 0x00000000. (Correct me if I am wrong)
I am wrong. (thanks @user3124812)
The annoying part is that this doesn't happen if I call bl _write directly, without putting the address in a register first.
For example:
.text
.globl main
.p2align 2
main:
stp x29, x30, [sp, -16]!
add x29, sp, 0
mov x2, #6
adrp x1, _fmt@PAGE
add x1, x1, _fmt@PAGEOFF
mov x0, #1
bl _write
mov w0, #0
ldp x29, x30, [sp], #16
ret
/* end function main */
.data
.balign 8
_fmt:
.ascii "Hello\n"
This works and prints Hello followed by a newline.
Why can't I put the label's address in a register before calling it? It seems to work with labels in my own section.
The values of symbols that are not defined in the same shared object as the code referencing them are not known. For this reason, you have to load such addresses from the global offset table (GOT) like this:
adrp x0, _write@GOTPAGE
ldr x0, [x0, _write@GOTPAGEOFF]
This loads the address of _write into x0. Such code can also be used for symbols defined in the same shared object as the reference, but in such cases it might be easier to just access them directly.
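Applied to the program from the question, a minimal sketch might look like the following. It keeps the question's main label and register choices, swaps the adrp/add pair for _write with the GOT load, and assumes _fmt lives in .data as in the working listing. I have not assembled this against your exact toolchain, so treat it as an illustration rather than a verified build.

```asm
.text
.globl main
.p2align 2
main:
    stp x29, x30, [sp, -16]!         // save frame pointer and link register
    add x29, sp, 0                   // set up the frame pointer
    mov x2, #6                       // third argument: byte count of "Hello\n"
    adrp x1, _fmt@PAGE               // _fmt is local, so direct addressing works
    add x1, x1, _fmt@PAGEOFF         // second argument: buffer address
    mov x0, #1                       // first argument: file descriptor 1 (stdout)
    adrp x3, _write@GOTPAGE          // page of the GOT slot for _write
    ldr x3, [x3, _write@GOTPAGEOFF]  // load the real address of _write from the GOT
    blr x3                           // indirect call through the register
    mov w0, #0                       // return 0
    ldp x29, x30, [sp], #16          // restore frame pointer and link register
    ret
.data
.balign 8
_fmt:
    .ascii "Hello\n"
```

The only change that matters is the ldr from the GOT slot: the adrp/add pair computes an address at link time, whereas the GOT slot is filled in by the dynamic linker once the real location of _write is known.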
When you call a function directly with bl, the linker makes the call go to the procedure linkage table (PLT; on Mach-O, the symbol stubs section), which holds a trampoline going to the actual function. Hence, direct calls work.