I was playing around with Compiler Explorer, trying to learn a little more about ARM assembly. I'm using arm64 msvc v19.latest. I noticed that I got one branch fewer when I wrote the loop like this:
int main(){
    for(unsigned i = 0; i<8;)
        i++;
    return 0;
}
compared to the "conventional" way of writing a for-loop like this:
int main(){
    for(unsigned i = 0; i<8;i++)
        ;
    return 0;
}
Is it therefore more efficient to write the for-loop in the unconventional way? I'll paste in both assembly listings for comparison. First, the unconventional version:
;Flags[SingleProEpi] functionLength[52] RegF[0] RegI[0] H[0] frameChainReturn[UnChained] frameSize[16]
|main| PROC
|$LN6|
sub sp,sp,#0x10
mov w8,#0
str w8,[sp]
|$LN2@main|
ldr w8,[sp]
cmp w8,#8
bhs |$LN3@main|
ldr w8,[sp]
add w8,w8,#1
str w8,[sp]
b |$LN2@main|
|$LN3@main|
mov w0,#0
add sp,sp,#0x10
ret
ENDP ; |main|
and the conventional version:
;Flags[SingleProEpi] functionLength[56] RegF[0] RegI[0] H[0] frameChainReturn[UnChained] frameSize[16]
|main| PROC
|$LN6|
sub sp,sp,#0x10
mov w8,#0
str w8,[sp]
b |$LN4@main|
|$LN2@main|
ldr w8,[sp]
add w8,w8,#1
str w8,[sp]
|$LN4@main|
ldr w8,[sp]
cmp w8,#8
bhs |$LN3@main|
b |$LN2@main|
|$LN3@main|
mov w0,#0
add sp,sp,#0x10
ret
ENDP ; |main|
If you want optimized code, ask your compiler for it! There's no point in examining how optimized unoptimized code is.
-O3 completely eliminates the loop.
Compiler Explorer demo: standard
Compiler Explorer demo: non-standard
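To see why the optimizer can drop the loop entirely, it helps to note that both loop shapes compute exactly the same thing. Here is a minimal sketch; the function names are hypothetical, and the counter is returned only so that the result is observable:

```c
/* Hypothetical helpers, not taken from the linked demos.
 * Each one wraps one of the two loop shapes from the question and
 * returns the final counter so the result can be compared. */

unsigned count_unconventional(void) {
    unsigned i = 0;
    for (; i < 8;)      /* increment lives in the loop body */
        i++;
    return i;
}

unsigned count_conventional(void) {
    unsigned i = 0;
    for (; i < 8; i++)  /* increment lives in the for header */
        ;
    return i;
}
```

Both functions return 8, so an optimizing compiler is free to treat either one as a constant, and a loop whose result is never used can be removed altogether.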
If we add something with a side-effect to the loop, we get the exact same result from both approaches.
Compiler Explorer demo: standard
Compiler Explorer demo: non-standard
That optimized code is the equivalent of
printf("%d\n", 1);
printf("%d\n", 2);
printf("%d\n", 3);
printf("%d\n", 4);
printf("%d\n", 5);
printf("%d\n", 6);
printf("%d\n", 7);
printf("%d\n", 8);
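For reference, here is a rough sketch of what the two loops with a side effect could look like. This is an assumption on my part: the loop bodies used in the linked demos are not reproduced in the text, and the function names and the call counter are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical reconstruction of the two loops with a side effect.
 * Each function prints the values 1 through 8, one per line, and
 * returns how many times printf was called. */

int print_unconventional(void) {
    int calls = 0;
    for (unsigned i = 0; i < 8;) {
        i++;                    /* increment in the body, before printing */
        printf("%u\n", i);
        calls++;
    }
    return calls;
}

int print_conventional(void) {
    int calls = 0;
    for (unsigned i = 0; i < 8; i++) {
        printf("%u\n", i + 1);  /* i + 1 so the output matches the other loop */
        calls++;
    }
    return calls;
}
```

Since the observable behavior of the two functions is identical, there is no reason to expect an optimizing compiler to treat them differently.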