assembly optimization nasm x86-16

How can I move the entire data segment 2 bytes to the right efficiently?


I want to move the entire data segment 2 bytes to the right, optimizing for code size, in NASM.

My best idea so far is:

    std
    pusha
    push es
    push ds
    pop es
    mov si, bp
    lea cx, [bp+0x1]
    lea di, [bp+0x2]
    rep movsb
    pop es
    popa

But it takes 17 bytes...


Solution

  • 11 bytes

    It seems the overhead of setting up for rep movsb is not worth it, so let's do a simple loop instead.

         00000000 89EB        mov bx, bp
         00000002 43          inc bx
                          again:
         00000003 4B          dec bx
         00000004 8A07        mov al, [bx]
         00000006 884702      mov [bx+2], al
         00000009 75F8        jnz again
    

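    The awkward inc bx at the beginning is to avoid an off-by-one error, since we have to copy bp+1 bytes. The jnz tests the zero flag left by dec bx, because the two mov instructions in between do not modify flags.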
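
    As a sanity check on the count, here is a hand trace of that loop for a hypothetical bp = 2, i.e. three bytes at offsets 0 through 2 (the value 2 is chosen purely for illustration):

         ; bp = 2 (illustrative value)
         ;   mov bx, bp / inc bx    bx = 3
         ;   dec bx                 bx = 2, ZF = 0
         ;   copy [2] -> [4]        jnz taken
         ;   dec bx                 bx = 1, ZF = 0
         ;   copy [1] -> [3]        jnz taken
         ;   dec bx                 bx = 0, ZF = 1
         ;   copy [0] -> [2]        jnz not taken, loop ends

    Three bytes are copied, highest address first, so no source byte is overwritten before it has been read.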

    If not for the quirk that dec leaves the carry flag unchanged, we could have shaved off one byte with:

         00000000 89EB        mov bx, bp
                          again:
         00000002 8A07        mov al, [bx]
         00000004 884702      mov [bx+2], al
         00000007 4B          dec bx
         00000008 73F8        jnc again ; BUG
    

    so that we effectively compare bx with -1 instead of with 0. However, dec does update the sign flag, so if bp is guaranteed to be less than 32K (below 0x8000), you could use jns again instead and be down to 10 bytes.
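
    For reference, the 10-byte variant would presumably look like the sketch below. It is the listing above with only the final jump changed, and it relies on the assumption that bp is below 0x8000. The byte counts in the comments are the standard 16-bit encodings.

         mov bx, bp              ; 2 bytes: start at the last byte, offset bp
         again:
             mov al, [bx]        ; 2 bytes: load the source byte
             mov [bx+2], al      ; 3 bytes: store it two bytes higher
             dec bx              ; 1 byte: SF becomes 1 when bx wraps from 0 to 0xFFFF
             jns again           ; 2 bytes: loop while bx is still non-negative

    With a larger bp the sign flag would already be set after an early dec, and the loop would exit before reaching offset 0.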

    I also tried some versions using lodsb / mov [si+3], al with DF=1 to do the load and decrement together (the offset is 3 rather than 2 because lodsb has already decremented si by the time of the store). The problem is that lodsb doesn't set flags, so we need something like cmp si, -1 to check for loop termination, and that's three more bytes. So I couldn't get that version shorter than 11.
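
    One way that attempt might be written is sketched below. This is a reconstruction for illustration, not necessarily the exact code that was tried, and it assumes DF is already set on entry, so no std is counted.

         mov si, bp              ; 2 bytes: start at the last byte, offset bp
         again:
             lodsb               ; 1 byte: al = [si], then si = si - 1 (DF = 1)
             mov [si+3], al      ; 3 bytes: old si + 2 is now si + 3
             cmp si, -1          ; 3 bytes: needed because lodsb leaves flags alone
             jne again           ; 2 bytes: stop once si has wrapped past 0

    That adds up to 11 bytes, the same as the plain loop, and a std would cost one more if DF is not already set.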