I sometimes use this pattern to iterate over an array of something:
mov [rsp+.r12], r12 ; Choose a register that calls inside the loop won't modify
mov r12, -1
.i:
inc r12
cmp r12, [rbp-.array_size]
je .end_i
; ... program logic ...
jmp .i
.end_i:
mov r12, [rsp+.r12]
I understand that testing for equality is enough, but shouldn't one "securely" test for "greater or equal" (to guard against a situation that should never occur)?
Should one use je or jge in these cases?
I am asking for a concrete tip that can reduce the likelihood of introducing bugs.
I've always kind of liked the idea of testing for a range instead of just for equality, in case a bit flips accidentally or something. But in x86 asm, keep in mind that cmp/jge can't macro-fuse on Core2 (in 32-bit mode), but cmp/je can. I thought that was going to be more relevant until I checked Agner Fog's microarch pdf and found that only Core2, not Nehalem, couldn't fuse that. It matters even less for 64-bit code, since macro-fusion doesn't work at all in 64-bit mode on Core2. (Later microarchitectures don't have that limitation, and can macro-fuse more and more combinations.)
Depending on the counter, you can usually count down without a CMP at all (dec/jnz). And often you know it doesn't need to be 64-bit, so you can use dec esi / jnz or whatever. dec esi / jge does work for signed counters, but dec doesn't modify CF, so you can't (usefully) use JA.
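For example, a count-down loop over an array might look like the sketch below. It assumes that rdi points at an array of dwords, that the element count is non-zero and fits in 32 bits, and that nothing inside the loop (such as a call) clobbers ecx or rdi. All of these are illustrative assumptions, not part of the question.

```nasm
    mov   ecx, [rbp-.array_size]  ; low 32 bits of the count; assumed non-zero
.loop:                            ; do {
    mov   eax, [rdi + rcx*4 - 4]  ;   load element ecx-1 (visits last to first)
    ; ... use eax ...
    dec   ecx                     ;   sets ZF when the counter reaches zero
    jnz   .loop                   ; } while (--ecx != 0);  no cmp needed
```

Writing ecx zero-extends into rcx, so using rcx in the addressing mode is safe here. If the loop body makes calls, keep the counter in a call-preserved register instead.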
Your loop structure, with an if() break in the middle and a jmp at the end, is not idiomatic for asm. The normal form is:
mov ecx, 100
.loop: ; do{
;; stuff
dec ecx
jge .loop ; }while(--ecx >= 0)
You can use jg to only restart the loop with positive ecx, i.e. loop from 100..1 instead of 100..0.
Having both a not-taken conditional branch and a taken unconditional branch in every iteration is less efficient than a single conditional branch at the bottom of the loop.
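As a minimal sketch, the count-up loop from the question could be restructured with the check at the bottom. It assumes, as in the question, that r12 is the index and that [rbp-.array_size] holds a 64-bit unsigned element count. The zero-length check before the loop and the choice of jb are additions for illustration, and the save/restore of r12 is omitted.

```nasm
    cmp   qword [rbp-.array_size], 0
    je    .end_i                  ; empty array: skip the loop entirely
    xor   r12d, r12d              ; i = 0 (writing r12d zero-extends into r12)
.i:                               ; do {
    ; ... program logic, using r12 as the index ...
    inc   r12
    cmp   r12, [rbp-.array_size]
    jb    .i                      ; } while (++i < array_size);  unsigned compare
.end_i:
```

Here jb is the range test: the loop continues only while the index is below the size, so an index that somehow overshoots still exits. Each iteration also executes just one branch.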
Expanding on discussion in question comments about saving/restoring r12: Normally you'd do something like:
my_func:
; push rbp
; mov rbp, rsp ; optional: make a stack frame
push rbx ; save the caller's value so we can use it
sub rsp, 32 ; reserve some space
imul edi, esi, 11 ; calculate something that we want to pass as an arg to foo
mov ebx, edi ; and save it in ebx
call foo
add eax, ebx ; and use value. If we don't need the value in rbx anymore, we can use the register for something else later.
... ;; calculate an array size in ecx
test ecx, ecx ; test for the special case of zero iterations *outside* the loop, instead of adding stuff inside. We can skip some of the loop setup/cleanup as well.
jz .skip_the_loop
; now use rbx as a loop counter
mov ebx, ecx
.loop:
lea edi, [rbx + rbx*4 + 10]
call bar ; bar(5*ebx+10);
; do something with the return value? In real code, you would usually want at least one more call-preserved register, but let's keep the example simple
dec ebx
jnz .loop
.skip_the_loop:
add rsp, 32 ; epilogue
pop rbx
;pop rbp ; pointless to use LEAVE; rsp had to already be pointing to the right place for POP RBX
ret
Notice how we use rbx for a couple of things inside the function, but only save/restore it once.