
ARM SVE Left-to-right vs. tree reduction


I am currently porting some applications to ARM SVE, using the intrinsic functions defined in the ARM C Language Extensions (ACLE) for SVE.

While checking the documentation, I came across two functions that sum the elements of a floating-point vector by reduction: one uses a left-to-right reduction, the other a tree-based reduction.

float64_t svadda[_f64](svbool_t pg, float64_t initial, svfloat64_t op);

float64_t svaddv[_f64](svbool_t pg, svfloat64_t op);

Documentation:

"These functions (ADDV) sum all active elements of a floating-point vector. They use a tree-based rather than left-to-right reduction, so the result might not be the same as that produced by ADDA."

Why would a tree-based reduction differ from left-to-right reduction? Do they mean this because of the rounding errors or am I missing something?


Solution

  • Yes. Floating-point addition is not associative, because each intermediate result (temporary) is rounded, so the order in which you do the operations can change the final result.

    You might need the strict left-to-right order (ADDA) to reproduce a specific order of operations exactly, for example to match a sequential scalar loop. Otherwise, you would normally do a horizontal sum (hsum) by extracting the high half into another vector and adding it vertically to the first vector, then repeating this narrowing until you are down to a single element.
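The difference between the two orders can be sketched in plain C, without SVE intrinsics. The function names `sum_left_to_right` and `sum_halving` and the `MAX_LANES` bound are hypothetical, chosen only for this illustration. The halving order follows the hsum pattern described above, which is not necessarily the exact order the ADDV hardware uses.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LANES 64 /* hypothetical upper bound for this sketch */

/* Strictly ordered sum, conceptually like ADDA:
 * ((initial + v[0]) + v[1]) + ... + v[n-1] */
double sum_left_to_right(double initial, const double *v, size_t n)
{
    double acc = initial;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Halving sum: add the high half onto the low half, then repeat on
 * the low half until one element remains.
 * Assumption: n is a power of two and at most MAX_LANES. */
double sum_halving(const double *v, size_t n)
{
    double tmp[MAX_LANES];
    assert(n > 0 && n <= MAX_LANES && (n & (n - 1)) == 0);

    /* Work on a copy so the caller's data is left untouched. */
    for (size_t i = 0; i < n; i++)
        tmp[i] = v[i];

    for (size_t half = n / 2; half > 0; half /= 2)
        for (size_t i = 0; i < half; i++)
            tmp[i] += tmp[i + half];

    return tmp[0];
}
```

For the input {1e30, 1.0, -1e30, 1.0} with an initial value of 0.0, `sum_left_to_right` returns 1.0, because the first 1.0 is lost to rounding when it is added to 1e30. `sum_halving` returns 2.0, because the two large values cancel each other before the small ones are added. This assumes the code is compiled without options such as -ffast-math that allow the compiler to reorder floating-point operations.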