Quick question. This code will not compile:
mov eax, dword [rbx+rsi*5]
I don't expect it to; my explanation is that mov and multiplication are two different CPU operations, so the only way this can be achieved is through bit-shifting.
However, this does compile:
mov eax, dword [lst+rsi*5]
With "lst" being a variable array. It also produces output when used in context (so the code compiles AND runs). What's the explanation for why this works?
Assembled with: yasm -Worphan-labels -g dwarf2 -f elf64 NAME.asm -l NAME.lst
x86 addressing modes have to fit the form [base + idx*scale + disp0/8/32]. (Or RIP-relative.)
The *scale is actually encoded as a 2-bit shift count, so it can only be 1, 2, 4, or 8. See "A couple of questions about [base + index*scale + disp]" and "Referencing the contents of a memory location. (x86 addressing modes)".
What's happening here is that your assembler decomposes [lst + rsi*5] into [lst + rsi + rsi*4] for you. (The same works for other scale factors of the form 1 + (1<<0..3), such as 3 or 9.)
(Here lst is a 4-byte (32-bit) absolute address that gets sign-extended to 64-bit. And yes, this works in Linux non-PIE executables; static code+data goes in the low 2GiB of virtual address space exactly so this can work.)
But if you already have a base register, there's no way to split it up and still have an encodeable addressing mode. [rbx + rsi + rsi*4] is impossible because it would need two base registers.
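To make the decomposition concrete, here is a minimal sketch for Linux x86-64. The array size, the file name scale.asm, and the build commands in the comments are assumptions for illustration, not taken from the question. Assembling with a listing file should show the first two loads producing the same machine code.

```nasm
; Sketch only. Assumed build (non-PIE, so 32-bit absolute addresses work):
;   yasm -f elf64 scale.asm -l scale.lst  &&  ld -o scale scale.o
section .bss
lst:    resd 64                         ; hypothetical array of 64 dwords

section .text
global _start
_start:
    xor  esi, esi                       ; rsi = 0, so every load below reads lst[0]

    mov  eax, [lst + rsi*5]             ; accepted: the assembler splits the 5 into 1 + 4
    mov  eax, [lst + rsi + rsi*4]       ; same thing spelled out: base=rsi, index=rsi, scale=4, disp32=lst
    mov  eax, [lst + rsi*4]             ; directly encodable: no base, index*4 + disp32

    ; mov eax, [rbx + rsi*5]            ; rejected: the base slot is already taken by rbx

    mov  eax, 60                        ; Linux x86-64 exit system call
    xor  edi, edi                       ; exit status 0
    syscall
```

Uncommenting the rbx line should make the assembler report an invalid effective address, which is the error from the question.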
Similarly, NASM and YASM let you write things like vaddps xmm0, [rbp] instead of vaddps xmm0, xmm0, [rbp+0] (omitting the first source operand when it's the same as the destination, and omitting the displacement even though RBP as a base register is not encodeable without one). Or, for example, writing [rbp + rax] instead of [rbp + rax*1]; an addressing mode can have at most one base and one index.
When the operation expressed by your code is unambiguous and encodeable somehow, assemblers sometimes have convenient features that let the source look different from the machine code / what you'd get from disassembly.
You wrote: "mov and multiplication are two different CPU operations"
Addressing modes do include addition and shifting, even though shl and add are also separate instructions, so that's not why. Also, imul ecx, [lst + rsi + rsi*4], 12345 is a valid instruction. So is a similar shift or add with a memory source or destination operand.
But yes, x86 addressing modes can't encode arbitrary multiplications, just a 2-bit shift count.
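As an illustration that scaled-index addressing is not tied to mov, the following sketch uses the same kind of operand with imul, shl, and add. The array contents, the multiplier 3, and the shift count are made up for the example. Note that the scale factor applies to a byte offset, so rsi = 4 is chosen here to keep both accesses dword-aligned.

```nasm
; Sketch only. Assumed build: yasm -f elf64 ops.asm && ld -o ops ops.o  (non-PIE)
section .data
lst:    dd 0, 0, 0, 0, 2, 7             ; hypothetical data: lst[4] = 2, lst[5] = 7

section .text
global _start
_start:
    mov  esi, 4                         ; rsi*5 = byte offset 20 (lst[5]), rsi*4 = byte offset 16 (lst[4])

    imul ecx, [lst + rsi + rsi*4], 3    ; memory source operand: ecx = lst[5] * 3 = 21
    shl  dword [lst + rsi*4], 2         ; memory destination operand: lst[4] = 2 << 2 = 8
    add  ecx, [lst + rsi*4]             ; memory source operand: ecx = 21 + 8 = 29

    mov  edi, ecx                       ; use the result as the exit status
    mov  eax, 60                        ; Linux x86-64 exit system call
    syscall
```

Running the program and then checking the shell's exit status (echo $?) should print 29.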
Normally you'd get a pointer in a register and increment it inside the loop:
add rsi, 5*4 ; 5*4 = 20 as an assemble time constant expression
add eax, [rsi]
This is basically a strength-reduction of the scaling that turns multiplication or shifting into addition. It means you can use simple non-indexed addressing modes, which are more efficient: smaller code size, and no un-lamination of micro-fused uops on Sandybridge-family CPUs.
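Here is one way the pointer-increment idea could look as a complete loop. It is a sketch under assumptions: the data, the lst_end label, and the choice to sum every fifth dword are invented for illustration. It also loads before incrementing, so that the first element is included.

```nasm
; Sketch only. Assumed build: yasm -f elf64 loop.asm && ld -o loop loop.o  (non-PIE)
section .data
lst:    dd 1,0,0,0,0, 2,0,0,0,0, 3,0,0,0,0, 4,0,0,0,0   ; 20 dwords of hypothetical data
lst_end:

section .text
global _start
_start:
    mov  esi, lst                       ; 32-bit absolute address, zero-extended into rsi
    xor  eax, eax                       ; running sum
.loop:
    add  eax, [rsi]                     ; plain [base] addressing mode, no index or scale
    add  rsi, 5*4                       ; advance 5 dwords; 5*4 is folded at assemble time
    cmp  rsi, lst_end
    jb   .loop                          ; keep going while the pointer is inside the array

    mov  edi, eax                       ; sum of lst[0], lst[5], lst[10], lst[15] = 1+2+3+4 = 10
    mov  eax, 60                        ; Linux x86-64 exit system call
    syscall
```

The loop never computes index*5 at run time; the stride of 20 bytes is baked into the add, and the exit status (echo $?) should be 10.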