assembly x86-64 reverse-engineering intel ida

What is the outcome of mov on non bracket memory locations?


I am having trouble telling whether the address is loaded or the contents at that address. Please help me clarify.

1. mov     [rsp+78h+arg_0], rsi
2. mov     rsi, cs:qword_1F39B60
3. mov     [rsp+78h+arg_38], rsi

On line 2, is it loading 1F39B60 in rsi or the contents of 1F39B60 in rsi?

Would lea rsi, [qword_1F39B60] be the same?

Is using mov on a memory operand without brackets even allowed, or is this just a visual IDA thing?

Can you explain to me why it shows cs: even though qword_1F39B60 is in the .data segment? Shouldn't it be ds:?

Last but not least, is rsp+78h a fancy way of saying rbp by the disassembler?


Solution

  • On line 2, is it loading 1F39B60 in rsi or the contents of 1F39B60 in rsi?

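    It loads the contents. qword_1F39B60 is a pseudolabel that your disassembler generated for the 8 bytes (a qword) at virtual address 0x01F39B60. In IDA's MASM-style syntax, mov rsi, cs:qword_1F39B60 is a load: rsi receives the 8 bytes stored at that address, not the number 0x01F39B60 itself.

    Would lea rsi, [qword_1F39B60] be the same?

    No. lea computes the address without accessing memory, so it would put 0x01F39B60 itself into rsi.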

    Is using mov on a memory operand without brackets even allowed, or is this just a visual IDA thing?

    Yes, it is allowed, but the meaning depends on the assembler. IDA syntax is MASM-style, like the GNU assembler's Intel syntax. In that style, mov reg, name is a load just like mov reg, [name]: it loads the register from a memory location. See "Confusing brackets in MASM32".

    In many other, saner assemblers (NASM, for example), mov reg, name loads the register with the address of that symbol/label. The (relocated) address is encoded in the instruction as an immediate operand. (The MASM syntax for this is mov reg, OFFSET mem, which loads reg with the offset of symbol mem as an immediate number.) In 64-bit code you'll more often see a RIP-relative LEA used to put an address into a register.
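The difference between the two readings can be sketched with a toy memory model. This is only an illustration, not how a CPU or assembler is implemented: the stored value 0x1122334455667788 and the helper names mov_load and lea are made up for the example.

```python
# Toy model: memory is a dict from address to the 8-byte value stored there.
# The stored value is hypothetical; only the address comes from the question.
QWORD_1F39B60 = 0x1F39B60                      # address the label refers to
memory = {QWORD_1F39B60: 0x1122334455667788}   # made-up contents


def mov_load(address):
    """MASM/IDA style `mov rsi, qword_1F39B60`: read the qword stored at the address."""
    return memory[address]


def lea(address):
    """`lea rsi, [qword_1F39B60]` or MASM `mov rsi, OFFSET qword_1F39B60`: the address itself."""
    return address


print(hex(mov_load(QWORD_1F39B60)))  # 0x1122334455667788
print(hex(lea(QWORD_1F39B60)))       # 0x1f39b60
```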

    Can you explain to me why it shows cs: even though qword_1F39B60 is in the .data segment? Shouldn't it be ds:?

    As Margaret Bloom explains in this answer, the cs prefix in mov rsi, cs:qword_1F39B60 is IDA's stupid way of expressing that rsi is loaded with 8 bytes from memory addressed relative to rip, assuming the instruction was encoded as 48 8B 35 rel32. According to the MOV documentation, this is the MOV r64, r/m64 form (REX.W + 8B /r) with a RIP-relative operand, which can only reach addresses within 2 GB above or below rip. ModR/M = 35h (mod=00b, reg=110b, r/m=101b) specifies RIP-relative addressing in 64-bit mode, as Table 2-7 says.
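To make the field breakdown concrete, here is a minimal Python sketch that splits the ModR/M byte and resolves a RIP-relative operand. The instruction address 0x401000 is an assumption chosen for the example (the question does not say where the instruction lives), and the rel32 is computed so that the target comes out as 0x1F39B60. The helper names are hypothetical.

```python
import struct


def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bit fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def rip_relative_target(insn_address, insn_bytes):
    """Resolve the memory address read by `48 8B /r` with a RIP-relative operand."""
    rex, opcode, modrm = insn_bytes[0], insn_bytes[1], insn_bytes[2]
    mod, reg, rm = split_modrm(modrm)
    if (rex, opcode) != (0x48, 0x8B) or (mod, rm) != (0b00, 0b101):
        raise ValueError("not a REX.W mov r64, [rip+rel32]")
    (rel32,) = struct.unpack_from("<i", insn_bytes, 3)   # signed, little-endian
    next_rip = insn_address + 7                          # rip points past the 7-byte instruction
    return (next_rip + rel32) & 0xFFFFFFFFFFFFFFFF


INSN_ADDRESS = 0x401000                                  # hypothetical location of the mov
rel32 = 0x1F39B60 - (INSN_ADDRESS + 7)
code = bytes([0x48, 0x8B, 0x35]) + struct.pack("<i", rel32)

print(split_modrm(0x35))                                 # (0, 6, 5): mod=00b, reg=110b (rsi), r/m=101b
print(hex(rip_relative_target(INSN_ADDRESS, code)))      # 0x1f39b60
```

Because rel32 is a signed 32-bit displacement, the reachable range is roughly 2 GB on either side of the instruction, which matches the limit mentioned above.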

    In 64-bit mode the contents of cs, ds, es and ss are irrelevant; those segment override prefixes are fully ignored. But if one was present, you'd see it in the encoding. If the encoding is BE 60 9B F3 01, it should be disassembled as mov esi, 1F39B60h: an immediate load of the address, which zero-extends into rsi.
    If it is 2E 48 8B 34 25 60 9B F3 01, it corresponds to mov rsi, [cs:1F39B60h] according to the NASM disassembler: a load from that absolute address with an explicit CS prefix. That's why it's so weird that IDA uses cs: to mean something else, when you'd expect it to be used only to indicate a 2E CS prefix in the machine code.
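The two byte sequences can be picked apart by hand. The following Python sketch decodes only these two specific forms, and the function names are hypothetical; it is not a general disassembler.

```python
import struct

imm_form = bytes.fromhex("BE609BF301")            # mov r32, imm32 (opcode B8+rd)
abs_form = bytes.fromhex("2E488B3425609BF301")    # CS prefix, REX.W, 8B /r, SIB, disp32


def decode_imm_form(b):
    """Decode `B8+rd imm32`: return (register number, immediate)."""
    if b[0] & 0xF8 != 0xB8:
        raise ValueError("not mov r32, imm32")
    (imm32,) = struct.unpack_from("<I", b, 1)
    return b[0] & 0b111, imm32


def decode_abs_form(b):
    """Decode prefix + REX + opcode + ModR/M + SIB + disp32 into named fields."""
    prefix, rex, opcode, modrm, sib = b[:5]
    (disp32,) = struct.unpack_from("<i", b, 5)    # sign-extended displacement
    return {
        "prefix": prefix, "rex": rex, "opcode": opcode,
        "mod": modrm >> 6, "reg": (modrm >> 3) & 7, "rm": modrm & 7,
        "scale": sib >> 6, "index": (sib >> 3) & 7, "base": sib & 7,
        "disp32": disp32,
    }


reg, imm = decode_imm_form(imm_form)
print(reg, hex(imm))                              # 6 0x1f39b60 -> register 6 is esi
fields = decode_abs_form(abs_form)
print(hex(fields["prefix"]), fields["rm"], fields["base"], hex(fields["disp32"]))
```

In the second form, r/m=100b selects a SIB byte, and a SIB with index=100b (none) and base=101b under mod=00b means a bare disp32, so the operand is the absolute address 0x1F39B60 rather than a RIP-relative one.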

    Last but not least, is rsp+78h a fancy way of saying rbp by the disassembler?

    No. A disassembler never knows whether rbp = rsp+78h (not without very sophisticated heuristic analysis).