
Why don't the likely and unlikely macros have any effect on ARM assembly code?


I took the example below from https://kernelnewbies.org/FAQ/LikelyUnlikely

#include <stdio.h>
#include <stdlib.h>   /* for atoi */
#define likely(x)    __builtin_expect(!!(x), 1)
#define unlikely(x)  __builtin_expect(!!(x), 0)

int main(char *argv[], int argc)
{
   int a;

   /* Get the value from somewhere GCC can't optimize */
   a = atoi (argv[1]);

   if (likely (a == 2))
      a++;
   else
      a--;

   printf ("%d\n", a);

   return 0;
}

and compiled it with the ARM gcc 8.2 compiler: https://godbolt.org/z/IC0aif

The original page tests this on x86, where the assembly output changes when likely (in the if condition of the code above) is replaced with unlikely. That difference shows the compiler laying out the code to favor the branch it expects to be taken.

But when I compile the same code for ARM (arm-gcc -O2), I don't see any difference in the assembly. Below is the ARM output, which is identical in both cases (likely and unlikely):

main:
        push    {r4, lr}
        ldr     r0, [r0, #4]
        bl      atoi
        cmp     r0, #2
        subne   r1, r0, #1
        moveq   r1, #3
        ldr     r0, .L6
        bl      printf
        mov     r0, #0
        pop     {r4, pc}
.L6:
        .word   .LC0
.LC0:
        .ascii  "%d\012\000"

Why doesn't the compiler optimize for branch prediction on ARM?


Solution

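  • As @rici said, your code is simple enough that it can be implemented with conditionally executed instructions (the subne/moveq pair in your output), so no branch is left whose layout the hint could change. You can see a difference if, for example, you call functions that are defined in a different compilation unit: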

    #define likely(x)    __builtin_expect(!!(x), 1)
    #define unlikely(x)  __builtin_expect(!!(x), 0)
    
    // only forward declarations:
    void foo();
    void bar();
    
    int main(int argc, char *argv[])
    {
       if (likely (argc == 2))
          foo();
       else
          bar();
    }
    

    Changing likely to unlikely swaps the order of the if and else branches on both ARM and x86: https://godbolt.org/z/UDzvf0. Whether this really makes a difference probably depends on the hardware you are running on, on whether the function is being called for the first time (otherwise the CPU's internal branch predictor probably matters more than the order of the instructions), and on many other things.
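
For experimenting in a single file, here is a minimal sketch of the same idea. It assumes GCC or Clang. The names hot_path, cold_path, dispatch, hot_count and cold_count are hypothetical, and `__attribute__((noinline))` is used here as a stand-in for "defined in a different compilation unit", which is an assumption and not what the answer above tested. It also illustrates that the hint never changes what the condition evaluates to, only (possibly) how the code is laid out.

```c
#define likely(x)    __builtin_expect(!!(x), 1)
#define unlikely(x)  __builtin_expect(!!(x), 0)

/* Hypothetical counters, present only so both paths have a visible effect. */
static int hot_calls;
static int cold_calls;

/* noinline (an assumption of this sketch) keeps the calls as real calls,
   so the compiler cannot fold both paths into conditional instructions. */
__attribute__((noinline)) void hot_path(void)  { hot_calls++; }
__attribute__((noinline)) void cold_path(void) { cold_calls++; }

/* The hint marks n == 2 as the expected case. The return value is the
   same whether likely or unlikely is used; only code layout may differ. */
int dispatch(int n)
{
    if (likely(n == 2)) {
        hot_path();
        return 1;
    }
    cold_path();
    return 0;
}

/* Accessors for the counters. */
int hot_count(void)  { return hot_calls; }
int cold_count(void) { return cold_calls; }
```

To compare layouts, one option is to compile this file twice with `gcc -O2 -S`, once as written and once with likely replaced by unlikely in dispatch, and diff the two assembly files. Whether the output differs depends on the target and compiler version.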