c assembly x86-64 instruction-set

x86 LEA instruction doing ambiguous things


Here's the C code:

int baz(int a, int b)
{
    return a * 11;
}

That is compiled to the following set of assembly instructions (with -O2 flag):

baz(int, int):
        lea     eax, [rdi+rdi*4]
        lea     eax, [rdi+rax*2]
        ret

The lea instruction computes the effective address of the second operand (the source operand) and stores it in the first operand. To me, it seems that the first instruction should load an address into the EAX register, but if so, multiplying RAX by 2 in the second lea instruction does not make sense, so I infer that these two lea instructions do not do quite the same thing.

I was wondering if someone could clarify what exactly is happening here.


Solution

  • Linux uses the System V AMD64 ABI calling convention, which passes the first integer parameter in the register RDI and returns the result in RAX. Here EAX is sufficient because the function returns a 32-bit int. The second parameter (passed in RSI) is unused.

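    LEA was originally intended for address calculations, going back to the 8086, but it is also used for integer arithmetic with a constant factor, which is the case here. The instruction only evaluates the address expression; it never accesses memory, so the value in the register does not have to be a valid address. The constant factor is encoded in the scale field of the SIB byte in the instruction encoding and can be 1, 2, 4, or 8.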

    So, the code could be explained by

    baz(RDI, RSI):            ; a, b
    lea     eax, [rdi+rdi*4]  ; RAX = 1*a + 4*a   = 5*a
    lea     eax, [rdi+rax*2]  ; RAX = 1*a + 2*RAX = 1*a + 2*(5*a)
    ret                       ; return RAX/EAX = 11*a
    

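    The upper half of RAX (the full 64-bit register) is automatically cleared by the first LEA, because writing a 32-bit register such as EAX zero-extends the result into the full 64-bit register; see this SO question.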