c  assembly  x86  inline-assembly

Why does cmov always return t_val?


The result I want from the cmov function below is: if pred is true, return t_val; otherwise return f_val. But when I actually run it, t_val is returned every time.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

int cmov(uint8_t pred, uint32_t t_val, uint32_t f_val) {
    uint32_t result;
    __asm__ volatile (
        "mov %2, %0;"
        "test %1, %1;"
        "cmovz %3, %0;"
        "test %2, %2;"
        : "=r" (result)
        : "r" (pred), "r" (t_val), "r" (f_val)
        : "cc"
    );
    return result;
}

int main() {
    int a = 1, b = 4, c = 5, d;
    int res = (a == 3);
    printf("res = %d\n", res);
    d = cmov(res, b, c);
    printf("d = %d\n", d);
    a = 3;
    res = (a == 3);
    d = cmov(res, b, c);
    printf("d = %d\n", d);
    return 0;
}

Solution

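  • You're missing an early-clobber on the output ("=&r"), so GCC probably picks the same register for the output as one of the inputs, most likely pred. In that case the initial mov %2,%0 overwrites pred with t_val before pred has been read, so test %1,%1 is really testing t_val (b). That value is non-zero, so cmovz never fires and you always get t_val back. Single-step the asm with a debugger, and/or look at GCC's asm output (on https://godbolt.org/ or with gcc -S).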

    The template is also inefficient. Use a "+r"(result) constraint for the output (with uint32_t result = t_val;) so you don't need a mov in the asm template; let the compiler get result = t_val done for you, possibly by simply choosing the same register.

    The test %2,%2 after the cmovz is also doing nothing; you're not even using a GCC6 flag-output operand, so nothing ever reads the flags it sets. It's a totally wasted instruction.

    Also, this doesn't need to be volatile. The output is a pure function of the inputs, and doesn't need to run at all if the output is unused.

    It's probably a bad idea to use inline asm at all for just a cmov; compile with -O3 and write your source such that GCC thinks it's a good idea to do if-conversion into branchless code. Inline asm destroys constant propagation and defeats other optimizations. It also forces you to put a test instruction in the template instead of reading FLAGS set by some earlier add or whatever, and it stops the compiler from reusing the same FLAGS result for multiple cmov or other instructions. https://gcc.gnu.org/wiki/DontUseInlineAsm

    Or, if you can't hand-hold the compiler into making the asm you want, write more of your real use-case in asm, not just a wrapper around cmov.
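To make the points above concrete, here is a minimal sketch of the three options discussed: the original template with only the early-clobber added, the "+r" version without the mov, and a plain C version left to the compiler. The names cmov_earlyclobber, cmov_rw and cmov_c are made up for this illustration and are not from the question. The sketch assumes GCC or Clang with the default AT&T syntax on x86; on any other target the asm variants just fall back to the C version. Using "q" rather than "r" for the 8-bit pred is my own choice, so that 32-bit builds get a register that has a byte-sized name.

```c
#include <stdint.h>

/* Plain C: with optimization enabled the compiler is free to emit cmov
 * itself (or to constant-fold the whole call away). */
static inline uint32_t cmov_c(uint8_t pred, uint32_t t_val, uint32_t f_val)
{
    return pred ? t_val : f_val;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))

/* Minimal fix: same template as the question, but the output is marked
 * early-clobber ("=&r") so it cannot share a register with any input.
 * The useless trailing test and the volatile are dropped. */
static inline uint32_t cmov_earlyclobber(uint8_t pred, uint32_t t_val,
                                         uint32_t f_val)
{
    uint32_t result;
    __asm__ (
        "mov   %2, %0\n\t"   /* result = t_val                     */
        "test  %1, %1\n\t"   /* ZF = (pred == 0)                   */
        "cmovz %3, %0"       /* if ZF is set, result = f_val       */
        : "=&r" (result)
        : "q" (pred), "r" (t_val), "r" (f_val)
        : "cc"
    );
    return result;
}

/* Leaner version: result starts out as t_val in C and is a read-write
 * ("+r") operand, so the template needs no mov. The only write happens
 * in the last instruction, after every input has been read. */
static inline uint32_t cmov_rw(uint8_t pred, uint32_t t_val, uint32_t f_val)
{
    uint32_t result = t_val;
    __asm__ (
        "test  %1, %1\n\t"   /* ZF = (pred == 0)                   */
        "cmovz %2, %0"       /* if ZF is set, result = f_val       */
        : "+r" (result)
        : "q" (pred), "r" (f_val)
        : "cc"
    );
    return result;
}

#else

/* Non-x86 fallback (assumption for portability of this sketch only). */
static inline uint32_t cmov_earlyclobber(uint8_t pred, uint32_t t_val,
                                         uint32_t f_val)
{
    return cmov_c(pred, t_val, f_val);
}

static inline uint32_t cmov_rw(uint8_t pred, uint32_t t_val, uint32_t f_val)
{
    return cmov_c(pred, t_val, f_val);
}

#endif
```

Called with the values from the question, for example cmov_rw(res, b, c), each variant should give c when res is 0 and b when res is non-zero. Comparing the gcc -S output of the three at -O2 is a quick way to see how much the plain C version lets the compiler do for you.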