Tags: assembly, x86, masm, tasm, addressing-mode

How to use two registers for addressing?


I have problems executing the code below

unique proc
    
    invoke lstrlen, esi
    cmp eax, 1
    jle quit
    mov ebx, 0; previous iterator
    mov edx, 0; next iterator
    dec eax
    mov ecx, eax
    inc eax
next:
    inc edx
    cmp [esi][ebx], [esi][edx]
    je skip
    cmp 
    inc ebx
    cmp ebx, edx
    dec ebx
    je skip
    inc ebx
    mov [esi][ebx], [esi][edx]
skip:
    loop next
    mov [esi][edx], '0'
quit:
    ret

unique endp

I am using indirect addressing here, so I expect

cmp [esi][ebx], [esi][edx]

to be replaced with

cmp ds:[esi][ebx], ds:[esi][edx]

Where am I wrong here?


Solution

  • Conventional instructions are limited to one memory operand

    You tagged your question x86, which means that you are using the Intel x86 instruction set.

    An Intel x86 instruction can have several operands, separated by commas in assembly language. An operand can be an immediate, where a constant expression evaluates to a value encoded in the instruction itself; a register, where the value is in a processor register; or memory, where the value is in RAM.
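
    As a rough illustration of the three operand kinds, here is a short fragment in MASM/TASM-style syntax. The registers and the constant are arbitrary choices made for this sketch, not taken from the question. The last line uses two registers to form one memory address, which is what the question's title asks about.

    ```asm
        mov eax, 42          ; immediate source: the constant 42 is encoded in the instruction
        mov eax, ebx         ; register source: the value is read from ebx
        mov eax, [esi]       ; memory source: the dword at address esi is read
        mov al, [esi+ebx]    ; memory source: one byte at address esi+ebx (base plus index)
    ```

    Each line has at most one memory operand; [esi+ebx] counts as a single memory operand even though two registers form its address.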

    You cannot use two memory operands in a single cmp instruction, so you should split the cmp instructions in your code. Instead of one instruction with two memory operands, use two instructions that each have one memory operand and one register operand: the first loads the value from memory into a register, and the second compares the value at the other memory location with that register.

    For example, instead of a single instruction

    cmp [esi][ebx], [esi][edx]
    

    use two instructions:

    mov al, [esi+edx]
    cmp [esi+ebx], al
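
    The same limit applies to the mov [esi][ebx], [esi][edx] line in the question, since mov cannot take two memory operands either. A minimal sketch of how that line and the final store might be split, assuming al is free to use as a scratch register at that point:

    ```asm
        mov al, [esi+edx]              ; load the byte at esi+edx into a register
        mov [esi+ebx], al              ; store it at esi+ebx

        mov byte ptr [esi+edx], '0'    ; byte ptr states the size of the immediate store
    ```

    In the last line there is no register to imply the operand size, so the size is given explicitly with byte ptr. The value '0' is kept as written in the question; if the intent was to terminate the string, which is only an assumption here, the value would be 0 rather than the character '0'.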
    

    String instructions take two memory operands, addressed through fixed index registers

    The cmpsb instruction, like the other string instructions such as movsb, is an exception: it technically has two memory operands. However, the way these operands are addressed is fixed. The index registers esi and edi (the register size may differ) supply the first and second memory addresses, respectively, and you cannot use other registers.

    At the assembly level, two forms of the instruction are allowed: the explicit-operand form and the no-operand form (e.g. cmpsb). The explicit-operand form lets you name the first and second memory addresses with symbols, e.g. cmps byte ptr ds:[esi], byte ptr es:[edi]. It exists for documentation purposes, but that documentation can be misleading, because the symbols do not have to name the correct source and destination addresses. If you specify them incorrectly, for example eax instead of esi, some assemblers, such as Turbo Assembler Version 5.4, may ignore the error and use esi anyway.

    The index registers for string operations are implied by the instruction opcode, so you have no choice of registers. The first memory address is always given by DS:(RSI/ESI/SI), although you can override the segment register for it. The second memory address is always given by ES:(RDI/EDI/DI), and even the segment register cannot be changed.

    You also have to set the direction flag, with either the cld or the std instruction, to specify whether the index registers are incremented or decremented after the operation. Only the comparison of the two memory operands updates the flags, not the increment or decrement of the index registers.

    Note that not all assemblers support the explicit-operand form; the Netwide Assembler, for example, reports an error on any use of it. Although Turbo Assembler ignores the index registers that you specify in the explicit form, it still checks the segment registers. If you specify a different segment register for the second memory address, it reports the error "Can't override ES segment".
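
    A minimal sketch of the no-operand form follows. The labels str1, str2 and equal_bytes are hypothetical names introduced for illustration, and the fragment assumes a flat 32-bit model in which ds and es address the same memory.

    ```asm
        mov esi, offset str1   ; first operand is read from ds:[esi]
        mov edi, offset str2   ; second operand is read from es:[edi]
        cld                    ; clear the direction flag so esi and edi are incremented
        cmpsb                  ; compare byte ds:[esi] with byte es:[edi], then advance both
        je equal_bytes         ; flags come from the comparison, not from the pointer update
    ```

    For the code in the question, using cmpsb would mean loading edi with the second address and letting the instruction move both pointers, so the mov/cmp pair shown earlier is probably the simpler fit there.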