Assembly - Moving through a register/array with an offset of 5

Quick question. This code will not compile:

mov eax, dword [rbx+rsi*5]

I don't expect it to, with the explanation that mov and multiplication are two different CPU operations. The only way it can be achieved is through bit-shifting.

However, this does compile:

mov eax, dword [lst+rsi*5]

With "lst" being a variable array. It also produces output when used in context (so the code compiles AND runs). What's the explanation for why this works?

Build command:

yasm -Worphan-labels -g dwarf2 -f elf64 NAME.asm -l NAME.lst

Solution

x86 addressing modes have to fit the form [base + idx*scale + disp0/8/32]. (Or RIP-relative.)

The *scale is actually encoded as a 2-bit shift count, so it can be 1, 2, 4, or 8. See "A couple of questions about [base + index*scale + disp]" and "Referencing the contents of a memory location. (x86 addressing modes)".

What's happening here is that your assembler decomposes [lst + rsi*5] into [lst + rsi + rsi*4] for you, using rsi as both the base and the index. (The same trick works for other scale factors of the form 1 + (1<<0..3), i.e. 2, 3, 5, or 9.)

Here lst is a 4-byte (32-bit) absolute address that gets sign-extended to 64 bits. And yes, this works in Linux non-PIE executables; static code+data goes in the low 2GiB of virtual address space exactly so this can work.

But if you already have a base register, there's no way to split it up and still have an encodable addressing mode. [rbx + rsi + rsi*4] would need two base registers plus an index, which is impossible.
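As a sketch of both points: the array contents, the choice of rdx as a scratch register, and using the exit status to show the result are all made up for illustration. It assumes a non-PIE static Linux executable, built along the lines of yasm -f elf64 followed by ld.

```nasm
section .data
lst:    dd 10, 11, 12, 13, 14, 15, 16, 17   ; hypothetical data, 8 dwords

section .text
global _start
_start:
    mov  esi, 4                      ; rsi*5 = 20 bytes = dword element 5
    mov  eax, [lst + rsi*5]          ; as written in the question
    mov  ecx, [lst + rsi + rsi*4]    ; same thing spelled out: base=rsi, index=rsi, scale=4, disp32=lst

    mov  ebx, lst                    ; array address in a register (fits in 32 bits in a non-PIE executable)
    lea  rdx, [rsi + rsi*4]          ; rdx = rsi*5, computed by a separate instruction
    mov  edx, [rbx + rdx]            ; base=rbx, index=rdx, scale=1: encodable

    mov  edi, eax                    ; exit status = the loaded element (15)
    mov  eax, 60                     ; exit system call number on x86-64 Linux
    syscall
```

All three loads read the same dword. The lea does the rsi*5 computation up front, which leaves a plain base+index addressing mode for the mov from [rbx + ...].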

Similarly, NASM and YASM let you write things like vaddps xmm0, [rbp] instead of vaddps xmm0, xmm0, [rbp+0]. RBP as a base register is not encodable without a displacement, so the assembler supplies a zero displacement for you, and it also lets you omit the first source operand when it's the same as the destination. Or, for example, you can write [rbp + rax] instead of [rbp + rax*1]; an addressing mode can have at most one base and one index.

When the operation expressed by your code is unambiguous and encodable somehow, assemblers sometimes have convenience features that let the source look different from the machine code / what you'd get from disassembly.

"mov and multiplication are two different CPU operations"

Addressing modes do include addition and shifting, even though shl and add are also separate instructions, so that's not why [rbx+rsi*5] is rejected. Also, imul ecx, [lst + rsi + rsi*4], 12345 is a valid instruction. So is a similar shift or add with a memory source or destination operand.

But yes, x86 addressing modes can't encode arbitrary multiplications, just a 2-bit shift count.

Looping through an array of arbitrary stride / element size:

Normally you'd get a pointer in a register and increment it inside the loop:

add  rsi, 5*4      ; 5*4 = 20 as an assemble time constant expression
add  eax, [rsi]

This is basically a strength reduction of the scaling: it turns the multiplication or shift into an addition. It means you can use simple non-indexed addressing modes, which are more efficient (smaller code size, and no un-lamination of micro-fused uops on Sandybridge-family CPUs).
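A minimal sketch of such a loop, assuming a made-up array arr of 40 dwords and a stride of 5 elements (20 bytes). The labels, the register choices, and returning the sum as the exit status are for illustration only, and like the rest of this answer it assumes a non-PIE static Linux executable.

```nasm
section .data
arr:     times 40 dd 1       ; hypothetical data: 40 dwords, all 1
arr_end:                     ; address one past the last element

section .text
global _start
_start:
    mov  esi, arr            ; rsi = pointer to the current element
    mov  edx, arr_end        ; rdx = end pointer, for the loop condition
    xor  eax, eax            ; running sum
.loop:
    add  eax, [rsi]          ; simple non-indexed addressing mode
    add  rsi, 5*4            ; advance by 5 dword elements = 20 bytes
    cmp  rsi, rdx
    jb   .loop               ; keep going while the pointer is below the end

    mov  edi, eax            ; exit status = sum of the visited elements
    mov  eax, 60             ; exit system call number on x86-64 Linux
    syscall
```

This visits byte offsets 0, 20, ..., 140 (8 elements), so the process exits with status 8. The scale factor never appears in an addressing mode; it only shows up as the constant added to the pointer.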