I'm running a code sample from a book on a Mac/aarch64, and getting a compilation error.
fn dereference(ptr: *const usize) -> usize {
    let mut res: usize;
    unsafe {
        asm!("mov {0}, [{1}]", out(reg) res, in(reg) ptr)
    };
    res
}
I get:
error: expected compatible register or logical immediate
--> src/main.rs:20:15
|
20 | asm!("mov {0}, [{1}]", out(reg) res, in(reg) ptr)
| ^
|
note: instantiated into assembly here
--> <inline asm>:1:10
|
1 | mov x8, [x0]
| ^
Having searched for the error message, I believe the architecture has something to do with it, but I haven't been able to figure out how to fix it yet. I'm looking at the Rust inline assembly doc, but as the topic is completely new to me, I can't map my example that is failing to compile to what I'm seeing in the doc.
mov x8, [x0] looks like x86-64 syntax with AArch64 register names. (From your answer, it looks like you just found an example that maybe failed to mention it was for x86-64, and tried compiling it for AArch64.)
AArch64 is a load/store architecture: special instructions (like ldr and str) are the only ones that access memory, e.g. ldr x8, [x0]. Look at compiler output for a Rust function that derefs a pointer: Godbolt shows a pure-Rust dereference compiling for aarch64-apple-darwin to ldr x0, [x0] and ret.
By contrast, x86-64 uses mov for loads, stores, and register copies: most x86-64 instructions that read and write registers are also available with a memory source or destination. (This is one of the CISC vs. RISC differences in the designs of the two architectures.)
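As a rough sketch of the fix, the function from the question can select the load instruction per target: ldr on AArch64, mov on x86-64. The cfg attributes and the options(...) list are my additions, not something from the book, and the function keeps the question's safe signature even though the caller has to guarantee the pointer is valid.

```rust
use std::arch::asm;

// AArch64: only load/store instructions touch memory, so the load is `ldr`.
#[cfg(target_arch = "aarch64")]
fn dereference(ptr: *const usize) -> usize {
    let res: usize;
    // Assumption: the caller passes a valid, aligned, readable pointer.
    unsafe {
        asm!("ldr {0}, [{1}]", out(reg) res, in(reg) ptr, options(nostack, readonly));
    }
    res
}

// x86-64: `mov` with a memory source operand performs the load
// (Rust inline asm uses Intel syntax on x86 by default).
#[cfg(target_arch = "x86_64")]
fn dereference(ptr: *const usize) -> usize {
    let res: usize;
    // Assumption: the caller passes a valid, aligned, readable pointer.
    unsafe {
        asm!("mov {0}, [{1}]", out(reg) res, in(reg) ptr, options(nostack, readonly));
    }
    res
}
```

With a local let x: usize = 42, calling dereference(&x) should give back 42 on either target.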
Read a tutorial on AArch64 asm; e.g. https://modexp.wordpress.com/2018/10/30/arm64-assembly looks reasonable. Google found it when I searched for inline assembly x86-64 aarch64 ldr. It has a short section comparing x86 vs. AArch64 assembly, but that's just a table of some equivalents for reg, reg instructions. The AArch64 equivalent for movzx eax, byte [rdi] is ldrb w0, [x0], vs. movzx eax, dil being uxtb w0, w0, again with special instructions for zero-extending or sign-extending narrow loads.
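The narrow-load comparison could be tried from Rust roughly like this. The function name load_u8_zero_extended is made up for illustration, and the {0:w} / {0:e} template modifiers (which select the 32-bit view of the output register) are my choice for the sketch.

```rust
use std::arch::asm;

// AArch64: `ldrb` loads one byte and zero-extends it into a w register.
#[cfg(target_arch = "aarch64")]
fn load_u8_zero_extended(ptr: *const u8) -> u32 {
    let res: u32;
    // Assumption: the caller passes a valid, readable pointer.
    unsafe {
        asm!("ldrb {0:w}, [{1}]", out(reg) res, in(reg) ptr, options(nostack, readonly));
    }
    res
}

// x86-64: `movzx` with a byte memory operand does the same job.
#[cfg(target_arch = "x86_64")]
fn load_u8_zero_extended(ptr: *const u8) -> u32 {
    let res: u32;
    // Assumption: the caller passes a valid, readable pointer.
    unsafe {
        asm!("movzx {0:e}, byte ptr [{1}]", out(reg) res, in(reg) ptr, options(nostack, readonly));
    }
    res
}
```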
Really they're quite different architectures, and a lot of things are done differently. E.g. instructions don't set FLAGS by default; you need an S suffix on the mnemonic (except for cmp and tst, which exist only to write flags). And instead of push/pop, there's stp / ldp with pre-decrement / post-increment addressing. (Always done as a pair of 64-bit regs to keep the stack aligned by 16.)
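To see the flags difference in practice, here is a small sketch that adds two integers and reads the carry flag. On AArch64 it needs adds (a plain add would leave the flags untouched), while the x86-64 add always writes them. The function name add_with_carry_out is hypothetical, picked only for this example.

```rust
use std::arch::asm;

// AArch64: `adds` is the flag-setting form of `add`; `cset` then turns
// the carry condition (cs) into 0 or 1.
#[cfg(target_arch = "aarch64")]
fn add_with_carry_out(a: u64, b: u64) -> (u64, bool) {
    let sum: u64;
    let carry: u64;
    unsafe {
        asm!(
            "adds {sum}, {a}, {b}",
            "cset {carry}, cs",
            sum = out(reg) sum,
            carry = out(reg) carry,
            a = in(reg) a,
            b = in(reg) b,
            options(pure, nomem, nostack),
        );
    }
    (sum, carry != 0)
}

// x86-64: `add` always updates FLAGS; `setc` copies the carry flag into a byte register.
#[cfg(target_arch = "x86_64")]
fn add_with_carry_out(a: u64, b: u64) -> (u64, bool) {
    let mut sum = a;
    let carry: u8;
    unsafe {
        asm!(
            "add {sum}, {b}",
            "setc {carry}",
            sum = inout(reg) sum,
            b = in(reg) b,
            carry = out(reg_byte) carry,
            options(pure, nomem, nostack),
        );
    }
    (sum, carry != 0)
}
```

Neither version passes preserves_flags, since both deliberately modify the condition flags.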