assembly x86 idioms loop-counter

How to test a loop condition properly: je or jge?


I sometimes use this pattern to iterate over an array:

    mov [rsp+.r12], r12 ; Choose a register that calls inside the loop won't modify
    mov r12, -1
.i:
    inc r12
    cmp r12, [rbp-.array_size]
    je .end_i
    ; ... program logic ...
    jmp .i
.end_i:
    mov r12, [rsp+.r12]

I understand that testing for equality is enough, but shouldn't one "securely" test for "greater or equal" (to guard against a situation that should never occur)?

Should one use je or jge in this case?

I am asking for a concrete tip that can reduce the likelihood of introducing bugs.


Solution

  • I've always kind of liked the idea of testing for a range instead of just for equality, in case a bit flips accidentally or something. The one cost to keep in mind in x86 asm is macro-fusion: on Core2 in 32-bit mode, cmp/je can macro-fuse into a single uop but cmp/jge can't. I thought that was going to be more relevant until I checked Agner Fog's microarch PDF: only Core2 has that restriction (Nehalem doesn't), and Core2 can't macro-fuse at all in 64-bit mode, so for 64-bit code like yours the choice makes no difference there. Later microarchitectures don't have that limitation, and can macro-fuse more and more combinations.

    Depending on the counter, you can usually count down without a cmp at all (dec/jnz). And often you know the counter doesn't need to be 64-bit, so you can use dec esi / jnz or whatever. dec esi / jge does work for signed counters, but dec leaves CF unmodified, so you can't (usefully) use ja for unsigned counters.
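
    If an unsigned counter really does have to run all the way down to and including zero, one option is to decrement with sub instead of dec, since sub does update CF. This is only a sketch: the count of 100 and the label name are placeholders, not something from the question.

    ```nasm
            mov     ecx, 100        ; example count, treated as unsigned
    .loop:                          ; do {
            ;; stuff using ecx = 100, 99, ..., 0
            sub     ecx, 1          ;     unlike dec, sub sets CF on the borrow from 0 - 1
            jnc     .loop           ; } while (ecx-- != 0)
    ```

    The body runs 101 times here (100 down to 0). The loop exits on the decrement that wraps ecx from 0 to 0xFFFFFFFF, because that is the only one that sets CF.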

    Your loop structure, with an if() break in the middle and a jmp at the end, is not idiomatic for asm. The normal form is a do{}while loop with the conditional branch at the bottom:

    mov ecx, 100
    
    .loop:             ; do{
        ;; stuff
        dec ecx
        jge .loop      ; }while(--ecx >= 0)
    

    You can use jg to only restart the loop with positive ecx, i.e. loop from 100..1 instead of 100..0.

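    Having a not-taken conditional branch and a taken unconditional branch in a loop is less efficient than a single taken conditional branch at the bottom: it costs two branch instructions per iteration instead of one.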
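
    Applied to the count-up loop in the question, that structure might look like the sketch below. It assumes that [rbp-.array_size] holds a signed 64-bit element count, as the question's code suggests, and it leaves out the save/restore of r12. The check before the loop handles an empty array, so the body never runs with an out-of-range index.

    ```nasm
            xor     r12d, r12d              ; i = 0 (writing r12d zero-extends into r12)
            cmp     r12, [rbp-.array_size]
            jge     .end_i                  ; skip the loop entirely if array_size <= 0
    .i:                                     ; do {
            ; ... program logic ...
            inc     r12
            cmp     r12, [rbp-.array_size]
            jl      .i                      ; } while (++i < array_size);
    .end_i:
    ```

    With the test at the bottom, the equality version of the branch would be jne and the range version is jl. Using jl costs nothing extra here and also ends the loop if r12 ever ends up past array_size.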


    Expanding on the discussion in the question comments about saving/restoring r12: normally you'd do something like:

    my_func:
        ; push rbp
        ; mov  rbp, rsp      ; optional: make a stack frame
    
        push   rbx           ; save the caller's value so we can use it
        sub    rsp, 32       ; reserve some space
    
        imul   edi, esi, 11   ; calculate something that we want to pass as an arg to foo
        mov    ebx, edi       ; and save it in ebx
        call   foo
        add    eax, ebx       ; and use value.  If we don't need the value in rbx anymore, we can use the register for something else later.
    
        ...  ;; calculate an array size in ecx
    
        test   ecx, ecx                ; test for the special case of zero iterations *outside* the loop, instead of adding stuff inside.  We can skip some of the loop setup/cleanup as well.
        jz    .skip_the_loop
    
        ; now use rbx as a loop counter
        mov    ebx, ecx
    .loop:
        lea    edi, [rbx + rbx*4 + 10]
        call   bar                     ; bar(5*ebx+10);
        ; do something with the return value?  In real code, you would usually want at least one more call-preserved register, but let's keep the example simple
        dec    ebx
        jnz    .loop
    .skip_the_loop:
    
        add   rsp, 32         ; epilogue
        pop   rbx
    
        ;pop  rbp             ; pointless to use LEAVE; rsp had to already be pointing to the right place for POP RBX
        ret
    

    Notice how we use rbx for a couple of things inside the function, but only save/restore it once.