floating-point, average, precision, numerical-methods

Safe computation of Geometric Mean


I am looking for a reason to choose one of the following two ways to compute the geometric mean of a long series of floating-point values x (both sketched in code below):

  1. take the n-th root of each x, then multiply them all
  2. multiply them all, then take the n-th root
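
In code, the two candidates look roughly like this (a C sketch; geomean_1 and geomean_2 are placeholder names of mine):

```c
#include <math.h>
#include <stddef.h>

/* Method 1: take the n-th root of each factor, then multiply. */
double geomean_1(const double *x, size_t n)
{
    double p = 1.0;
    for (size_t i = 0; i < n; i++)
        p *= pow(x[i], 1.0 / (double)n);
    return p;
}

/* Method 2: multiply all factors, then take a single n-th root. */
double geomean_2(const double *x, size_t n)
{
    double p = 1.0;
    for (size_t i = 0; i < n; i++)
        p *= x[i];
    return pow(p, 1.0 / (double)n);
}
```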

I have heard that for floating-point numbers, multiplications and divisions lose less information than additions and subtractions do. Therefore I am not considering the sum-of-exponents trick.

Should I compute the geometric mean via method 1 or method 2, and why?


Update 1, in response to comment:

All x are less than 1 and in double precision. Their orders of magnitude range from 10^-6 to 10^-1. Please assume the most common method for computing the n-th root, since I am using a built-in function of a programming language. Instead of overflow, I am worried about underflow, since all values are less than 1. You can assume the length of the series of x to be on the order of 10^8.


Solution

  • In general, in a sequence of floating-point operations that also involves contracting operations such as square root or cube root, it is advantageous from an accuracy perspective to perform the contracting operations last. For example, sqrt(1.0/x) is more accurate than 1.0/sqrt(x), sqrt(a*b) is more accurate than sqrt(a)*sqrt(b), and cbrt(a*b*c) is more accurate than cbrt(a)*cbrt(b)*cbrt(c).
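
    A quick way to convince oneself of this is to compare both sqrt variants against a higher-precision reference over many random inputs (a sketch; it assumes long double is wider than double, as on x86 Linux):

    ```c
    #include <math.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Compare sqrt(a*b) with sqrt(a)*sqrt(b) against a long double
       reference. sqrt(a*b) rounds twice; sqrt(a)*sqrt(b) rounds three
       times, so it should end up farther from the reference more often. */
    int main(void)
    {
        int worse_fused = 0, worse_split = 0;
        srand(42);
        for (int i = 0; i < 1000000; i++) {
            double a = (double)rand() / RAND_MAX;
            double b = (double)rand() / RAND_MAX;
            long double ref = sqrtl((long double)a * (long double)b);
            long double e1 = fabsl((long double)sqrt(a * b) - ref);
            long double e2 = fabsl((long double)(sqrt(a) * sqrt(b)) - ref);
            if (e1 > e2) worse_fused++;
            if (e2 > e1) worse_split++;
        }
        printf("sqrt(a*b) worse: %d, sqrt(a)*sqrt(b) worse: %d\n",
               worse_fused, worse_split);
        return 0;
    }
    ```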

    As a consequence, unless there is a danger of overflowing or underflowing the chosen floating-point format, such as IEEE-754 binary64 (double in C/C++), in intermediate computation, method [2] should be chosen. An additional aspect relevant to accuracy: if the n-th root is computed by exponentiation, such as pow() in C/C++, additional error will be introduced with every computed root, as explained for the case of the cube root in my answer to this question. Finally, the computation of an n-th root is slower than a multiplication, so doing only multiplies with a single root computation at the end is also the superior approach performance-wise.
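
    The pow() issue can be observed directly: 1.0/3.0 is not exactly representable, so pow(x, 1.0/3.0) raises x to a slightly wrong power, whereas cbrt() does not suffer from this (a sketch; the exact output depends on the math library):

    ```c
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double x = 1000.0;
        /* cbrt(1000.0) is 10 when correctly rounded; pow() is applied to
           the rounded exponent 0.33333333333333331 and on typical math
           libraries lands about one ulp below 10. */
        printf("cbrt(x)         = %.17g\n", cbrt(x));
        printf("pow(x, 1.0/3.0) = %.17g\n", pow(x, 1.0 / 3.0));
        return 0;
    }
    ```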

    Very accurate results can be achieved with method [2] by using a compensated product (akin to the compensated addition provided by Kahan summation). See the following paper for details:

    Stef Graillat, "Accurate Floating-Point Product and Exponentiation," IEEE Transactions on Computers, Vol. 58, No. 7, July 2009, pp. 994-1000
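
    A minimal sketch of such a compensated product, following the TwoProduct-based scheme described in the paper (comp_prod is a placeholder name; it needs a platform where fma() maps to hardware FMA to be fast):

    ```c
    #include <math.h>
    #include <stddef.h>

    /* Compensated product: alongside the running rounded product p,
       accumulate the rounding errors in e. The FMA-based TwoProduct
       err = fma(a, b, -h) with h = a*b recovers the rounding error of
       each multiplication exactly. Note that for the asker's data
       (10^8 factors below 1) p itself still underflows, so this must
       be combined with some form of scaling. */
    double comp_prod(const double *x, size_t n)
    {
        double p = 1.0;  /* running rounded product */
        double e = 0.0;  /* running compensation (error) term */
        for (size_t i = 0; i < n; i++) {
            double h   = p * x[i];
            double err = fma(p, x[i], -h);  /* exact: p*x[i] - h */
            e = e * x[i] + err;             /* propagate accumulated error */
            p = h;
        }
        return p + e;  /* compensated result */
    }
    ```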

    This compensated product can be computed particularly efficiently on systems that provide the FMA (fused multiply-add) operation in hardware, which is the case for all common modern processor architectures, both CPUs and GPUs. C/C++ provide convenient access to it via the standard math functions fma() and fmaf().

    Update: The asker clarified in a comment that the risk of underflow is acute, since there are on the order of 10^8 factors in [10^-6, 10^-1]. One possible workaround, mentioned by @Yves Daoust in a comment, is to split each factor into mantissa and exponent and accumulate the two separately, as sketched below. Whether this is practical will depend on the floating-point environment. While C and C++ provide the standard function frexp() for performing this splitting, this function may not be very fast.
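
    A sketch of that splitting approach (geomean_split is a placeholder name): the running mantissa product is kept near 1 and the binary exponent is tracked in an integer, so neither part can underflow:

    ```c
    #include <math.h>
    #include <stddef.h>

    /* Geometric mean via separate mantissa/exponent accumulation:
       the product is represented as m * 2^e with m kept in [0.5, 1). */
    double geomean_split(const double *x, size_t n)
    {
        double    m = 1.0;  /* running mantissa product */
        long long e = 0;    /* running sum of binary exponents */
        for (size_t i = 0; i < n; i++) {
            int ei, em;
            m *= frexp(x[i], &ei);  /* mantissa in [0.5, 1), exponent ei */
            e += ei;
            m = frexp(m, &em);      /* renormalize m back into [0.5, 1) */
            e += em;
        }
        /* geometric mean = (m * 2^e)^(1/n) = m^(1/n) * 2^(e/n) */
        double inv_n = 1.0 / (double)n;
        return pow(m, inv_n) * exp2((double)e * inv_n);
    }
    ```

    Since each mantissa lies in [0.5, 1), the unrenormalized product shrinks only slowly, so the second frexp() call could be hoisted out of the loop and performed every few dozen iterations to reduce its overhead, at the cost of a little more bookkeeping.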