c++ clang intel intrinsics built-in

Difference between __builtin_addcll and _addcarry_u64


Good morning (or good evening),

I was reading some legacy code at my company and found that the following intrinsic was used:

_addcarry_u64

However, I have to port this code to a platform that does not support it. After some research, I stumbled upon a Clang builtin that seems to do the exact same job:

__builtin_addcll

It takes the same arguments, though not in the same order. However, since there is little to no documentation about it, even on the Clang website, I have no idea whether the two are truly equivalent, especially since the return types and the argument order differ.

I tried using a macro to remap the arguments already in use, but it did not work (I know it's dirty).

#define _addcarry_u64(carryIn, src1, src2, carryOut) __builtin_addcll(src1, src2, carryIn, carryOut)

I feel I will have to wrap it in a function for it to work correctly (though I'm not sure even that would work).

Can anyone point me to documentation, or to anything else that could solve my problem?


Solution

  • The signature for Clang is

    unsigned long long __builtin_addcll(unsigned long long x,
                                        unsigned long long y,
                                        unsigned long long carryin,
                                        unsigned long long *carryout);
    

    and the Intel version is

    unsigned char _addcarry_u64 (unsigned char c_in,
                                 unsigned __int64 a,
                                 unsigned __int64 b,
                                 unsigned __int64 * out)
    

    So your macro is incorrect. _addcarry_u64 adds two numbers and returns the carry, while the sum is returned via a pointer, as stated by Intel:

    Add unsigned 64-bit integers a and b with unsigned 8-bit carry-in c_in (carry flag), and store the unsigned 64-bit result in out, and the carry-out in dst (carry or overflow flag).

    __builtin_addcll, on the other hand, returns the sum directly, and the carry-out is returned via a pointer, as you can see in the Clang documentation you linked:

    Clang provides a set of builtins which expose multiprecision arithmetic in a manner amenable to C. They all have the following form:

    unsigned x = ..., y = ..., carryin = ..., carryout;
    unsigned sum = __builtin_addc(x, y, carryin, &carryout);
    

    So the equivalent to

    int carryOut = _addcarry_u64(carryIn, x, y, &sum);
    

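    in Clang is (here carryOut must be declared as unsigned long long, because the builtin writes it through an unsigned long long *)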

    auto sum = __builtin_addcll(x, y, carryIn, &carryOut);
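
Since a macro cannot swap which value is returned and which is written through the pointer, a small inline function is probably the cleaner way to keep the legacy call sites unchanged. The following is only a sketch: the name addcarry_u64_compat is hypothetical, and the portable fallback branch (used when __builtin_addcll is not available) is my own addition, not something from either vendor's documentation. It also assumes, as Intel's description suggests, that any nonzero carry-in counts as a carry of 1.

```cpp
#include <cassert>

// Older compilers may not provide __has_builtin at all.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical wrapper: Intel-style argument order and return convention
// (carry-out is returned, the sum is stored through `out`).
static inline unsigned char addcarry_u64_compat(unsigned char carry_in,
                                                unsigned long long a,
                                                unsigned long long b,
                                                unsigned long long *out)
{
    assert(out != nullptr);

    // Assumption: any nonzero carry-in is treated as a carry of exactly 1.
    const unsigned long long cin = (carry_in != 0) ? 1ULL : 0ULL;

#if __has_builtin(__builtin_addcll)
    // Clang-style builtin: the sum is returned, the carry-out goes through the pointer.
    unsigned long long carry_out = 0;
    *out = __builtin_addcll(a, b, cin, &carry_out);
    return static_cast<unsigned char>(carry_out);
#else
    // Portable fallback: unsigned wrap-around reveals the carry of each step.
    const unsigned long long partial = a + b;      // may wrap
    const unsigned long long sum = partial + cin;  // may wrap
    *out = sum;
    return static_cast<unsigned char>((partial < a) || (sum < partial));
#endif
}
```

With this in place, the legacy code could be pointed at the wrapper, for example with `#define _addcarry_u64 addcarry_u64_compat` on the platforms that lack the Intel intrinsic, so the arguments stay in the order the existing code already uses.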