Variable offset from BP (emu8086 assembly)


A part of my compiler assignment includes translating a C program to 8086 assembly. Say I have the following in C:

int a[3];

The translated assembly code for this array declaration looks like this (assuming 0 initialization by default, and I have to use stack allocation for local variables):

MOV BP, SP
PUSH 0 ; [BP-2] refers to a[0]
PUSH 0 ; [BP-4] refers to a[1] 
PUSH 0 ; [BP-6] refers to a[2]

Say I have to translate the following C code to assembly:

a[2]=5;

The index gets calculated like this:

; offset from BP for a[idx] = offset for a[0] + idx * 2

MOV AX, 2 ; AX = idx
MOV BX, 2 ; BX = multiplier (size of an int in bytes)
MUL BX    ; DX:AX = idx * 2 (ignore DX for now)
MOV BX, 2 ; BX = offset for a[0] = 2
ADD AX, BX; AX = 4 + 2 = 6
MOV [BP-AX], 5

The assignment specification guarantees index*2 will never exceed 16 bytes, so DX will always contain 0000H after multiplication.

The problem lies in the last line MOV [BP-AX], 5. AX cannot be subtracted from BP, but for this purpose I need to do exactly that. How do I get around this problem?


Solution

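  • Your memory layout is backwards. Arrays should always be laid out in memory from low to high addresses, even when they are on the stack, so that a[i] lives at the address of a[0] plus i*sizeof(int); pointer arithmetic, and any code that receives the array as a pointer, depends on this. So you should have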

    a[0] at bp-6
    a[1] at bp-4
    a[2] at bp-2
    

    Thus your code should look more like:

    ;; compute idx * 2 = 4 in AX (your MUL step, without the final ADD of 2)
    MOV DI, AX
    MOV WORD PTR [BP-6+DI], 5
    

    So the base of the array is always at BP-6, and then indexing into it always involves addition, not subtraction. You can't use AX as an index register on the 16-bit 8086, but you can use SI or DI. (As ecm notes, you also need WORD PTR to tell the assembler to generate a two-byte store instruction, as opposed to one-byte.)
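
    As a rough sketch of how a compiler could compute these displacements (written in C purely for illustration): bp_displacement is a hypothetical helper, and it assumes 2-byte ints and that the array is the only local, sitting directly below BP.

```c
#include <assert.h>

/* Hypothetical helper: signed displacement from BP of a[idx], for a local
   array of `count` 2-byte ints whose last element ends just below BP. */
int bp_displacement(int count, int idx)
{
    const int elem_size = 2;             /* sizeof(int) on the 8086 target */
    assert(idx >= 0 && idx < count);
    int base = -(count * elem_size);     /* a[0] has the lowest address */
    return base + idx * elem_size;       /* indexing only ever adds */
}
```

    For int a[3] this gives -6, -4 and -2 for indices 0, 1 and 2, matching the layout above.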

    (You might be wondering why "subtraction" of a constant displacement is allowed in an effective address, while subtraction of an index register is not. Well, it technically isn't; the instruction can only encode addition of a constant displacement, stored either as 16 bits or as 8 bits that get sign-extended. But the assembler takes care of this for you, encoding [BP-6+DI] as the addition of the constant -6. It's equivalent to [BP+(-6)+DI] or [BP+0FFFAh+DI].)
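
    To make the wraparound concrete, here is a small C model of the address arithmetic. It is only a sketch: effective_address and sign_extend8 are names made up for this illustration, and segments are ignored.

```c
#include <stdint.h>

/* Offset part of [BP + disp + DI]: the 8086 adds modulo 2^16. */
uint16_t effective_address(uint16_t bp, uint16_t disp, uint16_t di)
{
    return (uint16_t)(bp + disp + di);
}

/* An 8-bit displacement is widened to 16 bits by copying its sign bit. */
uint16_t sign_extend8(uint8_t d8)
{
    return (d8 & 0x80) ? (uint16_t)(0xFF00u | d8) : (uint16_t)d8;
}
```

    With BP = 0100h and DI = 4, adding the displacement 0FFFAh yields 00FEh, which is exactly 0100h - 6 + 4. The one-byte displacement 0FAh sign-extends to the same 0FFFAh.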

    You could make this more efficient by computing the offset in DI instead of AX in the first place, avoiding the extra MOV DI, AX. Also, if your compiler is able to figure out that sizeof(int) is the constant 2, then the multiplication by 2 should be done with SHL instead of MUL for efficiency. So it probably wants to look more like

    MOV DI, 2 ; or some code choosing index 2 at runtime
    SHL DI, 1 ; multiply by sizeof(int) which is 2
    MOV WORD PTR [BP-6+DI], 5
    

    Of course if the index 2 is really a constant, then you would ideally optimize the whole thing into MOV WORD PTR [BP-2], 5.
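
    A code generator could choose between the two forms roughly as follows. This is a sketch, not a reference implementation: emit_const_store and emit_indexed_store are hypothetical names, both assume 2-byte ints and that a[0] sits at BP - 2*count, and idx_operand is assumed to be something MOV accepts as a source.

```c
#include <assert.h>
#include <stdio.h>

/* Index known at compile time: fold everything into one displacement. */
int emit_const_store(char *out, size_t cap, int count, int idx, int value)
{
    assert(idx >= 0 && idx < count);
    int disp = -2 * count + 2 * idx;     /* always negative for a local array */
    return snprintf(out, cap, "MOV WORD PTR [BP%+d], %d\n", disp, value);
}

/* Index only known at run time: scale it in DI and add it to the base. */
int emit_indexed_store(char *out, size_t cap, int count,
                       const char *idx_operand, int value)
{
    return snprintf(out, cap,
                    "MOV DI, %s\n"
                    "SHL DI, 1\n"
                    "MOV WORD PTR [BP%+d+DI], %d\n",
                    idx_operand, -2 * count, value);
}
```

    For int a[3], calling emit_const_store with idx 2 and value 5 writes MOV WORD PTR [BP-2], 5 into the buffer, and emit_indexed_store writes the three-instruction sequence shown above.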


    More generally, if you can't do what you want in one instruction, just emit more instructions to do it in multiple steps. In some cases you might need to use additional registers.

    If you really did want to have your array backwards and achieve the effect you originally asked about, you could do, for instance:

    ; compute offset in AX as you have done
    MOV DI, AX
    NEG DI
    MOV WORD PTR [BP+DI], 5
    

    You could also compute the entire address separately in another register. BX, SI or DI would work. Note that they address the DS segment by default, whereas BP addresses the SS segment. So if you are in a code model where the stack and data segment are possibly different, you'll need a segment override.

    ; compute offset in AX as you have done
    MOV BX, BP
    SUB BX, AX
    MOV WORD PTR SS:[BX], 5
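
    Both workarounds for the backwards layout reach the same offset, which a small C model of the 16-bit arithmetic can confirm. The function names are invented for this sketch, and it models only the offset, not the segment.

```c
#include <stdint.h>

/* MOV DI, AX / NEG DI / [BP+DI]: negate, then add. */
uint16_t offset_via_neg(uint16_t bp, uint16_t ax)
{
    uint16_t di = (uint16_t)(0 - ax);    /* NEG is a two's-complement negate */
    return (uint16_t)(bp + di);
}

/* MOV BX, BP / SUB BX, AX / SS:[BX]: subtract directly. */
uint16_t offset_via_sub(uint16_t bp, uint16_t ax)
{
    return (uint16_t)(bp - ax);
}
```

    With BP = 0FFFEh and AX = 6, both give 0FFF8h. The only remaining difference is the default segment, which is why the second version needs the SS: override.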