Tags: assembly, x86, terminology, att, addressing-mode

Confusion about addressing modes - how does a register by itself outside () work as an ADDRESS_OR_OFFSET constant?


In Programming from the Ground Up, in chapter 3 I read

The general form of memory address references is this:

ADDRESS_OR_OFFSET(%BASE_OR_OFFSET, %INDEX, MULTIPLIER)

All fields are optional. To calculate the address, simply perform the following calculation:

FINAL ADDRESS = ADDRESS_OR_OFFSET + %BASE_OR_OFFSET + MULTIPLIER * %INDEX

ADDRESS_OR_OFFSET and MULTIPLIER must both be constants, while the other two must be registers. If one of the pieces is left out, it is just substituted with zero in the equation.

Now, I assume that "substituted with zero" is a typo, at least for MULTIPLIER: if its default were 0, the value of %INDEX would be irrelevant, as the product would always be zero anyway. I guess 0 is the default for the other 3?

Nonetheless, what confuses me the most is that from the description above I understand that parentheses and commas determine which parts of what we write map to the 4 "operands" of the addressing mode.

But then, in the following chapter I read

For example, the following code moves whatever is at the top of the stack into %eax:

movl (%esp), %eax

If we were to just do

movl %esp, %eax

%eax would just hold the pointer to the top of the stack rather than the value at the top.

But I don't understand why. I mean,

  • given the FINAL ADDRESS expression above, I would say that

    • if we put %esp in parentheses, it will play the role of %BASE_OR_OFFSET, with ADDRESS_OR_OFFSET and %INDEX defaulting to 0 and MULTIPLIER to 1,
    • if we put %esp outside parentheses, it will play the role of ADDRESS_OR_OFFSET, with %BASE_OR_OFFSET and %INDEX defaulting to 0 and MULTIPLIER to 1,

    and the sum would still be the same.

  • Furthermore, how is %esp constant?

    • Maybe I'm making the mistake of thinking that it is not constant because I think about the content of %esp?
    • If that's the case, and %esp is constant because it is the name of a physically fixed register, then what is non-constant in this context?

Solution

  • Correct, the default multiplier is 1.
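
As a rough illustration of the formula quoted in the question, here is a minimal Python sketch of the address calculation with those defaults. The function name `effective_address` and its parameter names are made up for this example, and masking to 32 bits assumes 32-bit code like the book's.

```python
def effective_address(disp=0, base=0, index=0, scale=1):
    """Model of disp(%base, %index, scale) in 32-bit code.

    Omitted pieces default to 0, except scale, which defaults to 1.
    The arguments are the *values* held in the registers, not their names.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only encodes scale factors 1, 2, 4 and 8")
    # Addresses wrap around at 32 bits in 32-bit mode.
    return (disp + base + scale * index) & 0xFFFFFFFF


# (%esp) with %esp holding 0x1000: only the base is present.
print(hex(effective_address(base=0x1000)))
# 8(%ebx,%ecx,4) with hypothetical values %ebx = 0x1000, %ecx = 3.
print(hex(effective_address(disp=8, base=0x1000, index=3, scale=4)))
```

Note that this only computes an address. Whether an operand is a memory reference at all is a separate question, decided by the syntax.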


    movl %esp, %eax isn't using a memory addressing-mode at all. It's a register-direct operand, so it's syntactically different from mov symbol_name, %eax (a load from an absolute address).

    There's a register but it's not inside () so the disp(base,idx,scale) syntax doesn't apply.
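
To make the difference concrete, here is a small Python model of the two instructions from the question. It is only a sketch: the dictionaries standing in for the register file and memory, the helper names, and the values 0x1000 and 0xDEADBEEF are all invented for illustration.

```python
def mov_reg_to_reg(regs, src, dst):
    """movl %src, %dst: copy the register's own value; memory is never read."""
    regs[dst] = regs[src]


def mov_mem_to_reg(regs, mem, base, dst):
    """movl (%base), %dst: treat the base register's value as an address
    and load whatever is stored there."""
    regs[dst] = mem[regs[base]]


regs = {"esp": 0x1000, "eax": 0}
mem = {0x1000: 0xDEADBEEF}  # hypothetical value at the top of the stack

mov_reg_to_reg(regs, "esp", "eax")
print(hex(regs["eax"]))  # the stack pointer itself

mov_mem_to_reg(regs, mem, "esp", "eax")
print(hex(regs["eax"]))  # the value the stack pointer points at
```

In the first case %esp is not an ADDRESS_OR_OFFSET constant. No address is computed at all.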

    In machine code, the ModRM byte's 2-bit "mode" field uses 0b11 to encode that the operand is a register instead of memory. (The other 3 encodings select memory with no displacement, with a disp8, or with a disp32: https://wiki.osdev.org/X86-64_Instruction_Encoding#ModR.2FM_and_SIB_bytes. See also "rbp not allowed as SIB base?" for the fun special cases that allow disp32 with no registers, and that make the SIB byte optional to save machine-code size for simple addressing modes.) With ModR/M.mode = 0b11, the r/m field is just a simple register number. Similarly, in assembly language, a bare register name gives you the register operand directly, rather than using its value as an address to access memory.

    (I'm not sure this is a useful analogy, but I think the useful point is that a register operand is a different thing from a memory operand even in x86 machine code. They are qualitatively different and need to be distinguished.)
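
For readers who want to see the mode field directly, here is a short Python sketch that splits a ModRM byte into its three fields. The helper names are hypothetical. The two example bytes assume the encodings an assembler typically emits in 32-bit code: `89 e0` for `movl %esp, %eax` and `8b 04 24` for `movl (%esp), %eax`.

```python
def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11   # top 2 bits: addressing "mode"
    reg = (byte >> 3) & 0b111  # middle 3 bits: register operand (or opcode extension)
    rm = byte & 0b111          # low 3 bits: register number or memory-form selector
    return mod, reg, rm


def operand_kind(byte):
    """Report whether the r/m operand is a register or a memory reference."""
    mod, _, _ = split_modrm(byte)
    return "register" if mod == 0b11 else "memory"


# ModRM byte 0xE0 from `89 e0` (movl %esp, %eax).
print(split_modrm(0xE0), operand_kind(0xE0))
# ModRM byte 0x04 from `8b 04 24` (movl (%esp), %eax); rm = 0b100 with a
# memory mode means a SIB byte follows (0x24 here).
print(split_modrm(0x04), operand_kind(0x04))
```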


    Also related: