I took the example below from https://kernelnewbies.org/FAQ/LikelyUnlikely:
#include <stdio.h>
#include <stdlib.h>  /* for atoi */
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
int main(char *argv[], int argc)
{
    int a;
    /* Get the value from somewhere GCC can't optimize */
    a = atoi (argv[1]);
    if (likely (a == 2))
        a++;
    else
        a--;
    printf ("%d\n", a);
    return 0;
}
and compiled it (https://godbolt.org/z/IC0aif) with the ARM gcc 8.2 compiler.
In the original link, they tested it on x86, and the assembly output differs if the likely macro (in the if condition of the code above) is replaced with unlikely, which shows the optimisation the compiler performs for branch prediction.
But when I compile the above code for ARM (arm-gcc -O2), I don't see any difference in the assembly code. Below is the ARM assembly output, which is identical in both cases (likely and unlikely):
main:
        push    {r4, lr}
        ldr     r0, [r0, #4]
        bl      atoi
        cmp     r0, #2
        subne   r1, r0, #1
        moveq   r1, #3
        ldr     r0, .L6
        bl      printf
        mov     r0, #0
        pop     {r4, pc}
.L6:
        .word   .LC0
.LC0:
        .ascii  "%d\012\000"
Why doesn't the compiler optimize for branch prediction on ARM?
As @rici said, your code is simple enough that it can be realized with conditionally executed instructions (the subne/moveq pair in your output), so no branch remains for the hint to affect. You can see a difference, e.g., if you call functions that are implemented in a different compilation unit:
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
// only forward declarations:
void foo();
void bar();
int main(char *argv[], int argc)
{
    if (likely (argc == 2))
        foo();
    else
        bar();
}
Changing likely to unlikely switches the order of the if and else branches, for both ARM and x86: https://godbolt.org/z/UDzvf0. Whether this really makes a difference likely depends on the hardware you are running on, whether the function is being called for the first time (otherwise, the CPU-internal branch prediction likely has a greater influence than the order of the instructions), and probably many other things.
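If you want to try this locally without a second source file, a rough single-file sketch could look like the one below. The names hot_path, cold_path and dispatch are hypothetical and not part of the original example, and the noinline attribute is an assumption used to stand in for "implemented in a different compilation unit" by keeping the calls as real calls. Whether the generated code actually changes between likely and unlikely depends on the compiler version, target and flags, so treat this as something to inspect rather than a guaranteed result.

```c
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helpers. noinline asks GCC not to inline them,
   so each side of the condition remains an actual call. */
__attribute__((noinline)) int hot_path(int v)
{
    return v + 1;
}

__attribute__((noinline)) int cold_path(int v)
{
    return v - 1;
}

/* The hint only tells the compiler which side to expect;
   the result is the same with likely, unlikely, or no hint. */
int dispatch(int v)
{
    if (likely(v == 2))
        return hot_path(v);
    else
        return cold_path(v);
}
```

Compiling this once as shown and once with likely swapped for unlikely, using -O2 -S with your ARM cross compiler, and then diffing the two assembly files should show whether the block order changed on your toolchain.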