
Arithmetic operations on 64-bit double values using ARM NEON intrinsics on ARM64


I'm trying to implement a simple 64-bit double addition using ARM NEON. I came across this question, but its answer did not include a sample implementation using ARM intrinsics, so any help in providing a complete example is greatly appreciated. Here is what I have tried so far, using the integer-type registers.

Side Note:

Please note that I'm using the intel/ARM_NEON_2_x86_SSE library to simulate this ARM NEON code with SSE instructions. Should I switch to native ARM NEON to test this code?

int main()
{
    double Val1[2] = { 2.46574621,0.46546221};
    double Val2[2] = { 2.63565654,0.46574621};
    double Sum[2] = { 0.0,0.0 };
    double Sum_C[2] = { 0.0,0.0};   

    vst1q_s64(Sum,                      //Store int64x2_t
        vaddq_s64(                      //Add   int64x2_t
            vld1q_s64(&(Val1[0])),      //Load  int64x2_t
            vld1q_s64(&(Val2[0])) ));   //Load  int64x2_t

    for (size_t i = 0; i < 2; i++)
    {
        Sum_C[i] = Val1[i] + Val2[i];
        if (Sum_C[i] != Sum[i])
        {
            cout << "[Error]    Sum : " << Sum[i] << "  !=  " << Sum_C[i] << "\n";
        }
        else
            cout << "[Passed]   Sum : " << Sum[i] << "  ==  " << Sum_C[i] << "\n";
    }  

    cout << "\n";
}

Output:

[Error] Sum : -1.22535e-308     !=  5.1014
[Error] Sum : 1.93795e+307      !=  0.931208

Solution

  • Double precision isn't supported by NEON on AArch32.

    Therefore, if you target armv7-a while using the data type float64x2_t, the code won't build.

    If your test platform is an AArch64 machine with a 64-bit OS installed, just exclude the AArch32 target from your makefile.
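
For reference, here is a minimal sketch of the addition from the question using the double-precision type float64x2_t, assuming a GCC- or Clang-style compiler that defines __aarch64__ on 64-bit ARM targets. The helper name add_f64x2 is hypothetical and not from the original post. The integer intrinsics in the question (vld1q_s64, vaddq_s64, vst1q_s64) add the doubles' bit patterns as 64-bit integers, which would explain the garbage sums; the _f64 variants perform a floating-point addition instead. On non-AArch64 targets the sketch falls back to plain scalar code so that it still builds.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__aarch64__)
#include <arm_neon.h>   // float64x2_t and the *_f64 intrinsics are AArch64-only
#endif

// Hypothetical helper (not from the original post):
// computes out[i] = a[i] + b[i] for i = 0, 1.
void add_f64x2(const double* a, const double* b, double* out)
{
    assert(a && b && out);
#if defined(__aarch64__)
    float64x2_t va = vld1q_f64(a);       // load two doubles
    float64x2_t vb = vld1q_f64(b);       // load two doubles
    vst1q_f64(out, vaddq_f64(va, vb));   // floating-point add, then store
#else
    // Scalar fallback (assumption: used only so the sketch builds off AArch64).
    for (std::size_t i = 0; i < 2; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

Calling add_f64x2(Val1, Val2, Sum) with the arrays from the question should then make Sum match the scalar Sum_C in the comparison loop.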