floating-point, precision, mean, convergence

Enhance convergence of the mean computation of a cosine or sine function


I'm computing the mean of values obtained from a*cos(ωt)+μ, but it doesn't converge as I would expect. There is a residual error on the order of 10^-4, which I find quite large for a 64-bit precision computation.

I would expect that the computed mean would converge to μ.

I tried different processing durations and ω values without succeeding in increasing the precision.

I also tried different ways to compute the average. One method is to simply accumulate the values and divide the sum at the end. Another method is to compute the mean incrementally with m += (x-m)/n, incrementing n at each step.
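
For reference, here is a stripped-down sketch of the two variants (the function names are just illustrative; my full test code is below):

package main

import "fmt"

// sumMean accumulates the values and divides the sum once at the end.
func sumMean(xs []float64) float64 {
    var sum float64
    for _, x := range xs {
        sum += x
    }
    return sum / float64(len(xs))
}

// incMean updates a running mean with m += (x-m)/n, incrementing n for each sample.
func incMean(xs []float64) float64 {
    var n, m float64
    for _, x := range xs {
        n++
        m += (x - m) / n
    }
    return m
}

func main() {
    xs := []float64{4, 6, 5, 7, 3}
    fmt.Println(sumMean(xs), incMean(xs)) // both print 5
}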

What worries me is that I also compute the variance with this mean, and the error also affects its convergence.

Is there a way to increase the precision of the mean computation? In the end I will have to sum thousands of cosine and sine values with different phases and the same frequency. The result is still a cosine, so getting the mean computation right and as precise as possible is important.

Edit: here is the Go code I use for my tests.

You can run and test various parameters directly in the go playground at this location: https://go.dev/play/p/hwbYf3Y5tPO

package main

import (
    "fmt"
    "math"
)

const (
    lambda      = 3005
    amplitude   = 2.
    mean        = 5.
    nIterations = 1_000_000
)

func cosFunc(lambda, amplitude, mean float64) func() float64 {
    var t float64
    pulsation := 2 * math.Pi / lambda
    return func() float64 {
        x := amplitude*math.Cos(pulsation*t) + mean
        t++
        return x
    }
}

func meanFunc() func(x float64) float64 {
    var n, m float64
    return func(x float64) float64 {
        n++
        m += (x - m) / n
        return m
    }
}

func main() {
    gen := cosFunc(lambda, amplitude, mean)
    mean := meanFunc()
    for i := 0; i < nIterations; i++ {
        x := gen()
        m := mean(x)
        fmt.Printf("x: %f mean: %g\n", x, m)
    }
}

Increasing the number of iterations or lambda doesn't improve the precision much. I suspect it is because x is a cosine function.


Solution

  • The calculated result is mathematically correct.

    The code does not sample whole circles, and the deviation from the mean is almost entirely due to the final fractional circle. The deviation is a mathematical property of the sequence, not an artifact of floating-point arithmetic.

    With lambda = 3,005, the summation samples 3,005 points distributed around the circle. These points are repeated as the iterations go on. With nIterations = 1,000,000, the first 2,340 points are sampled ⌈1,000,000 / 3,005⌉ = 333 times, and the remaining 665 points are sampled 332 times. Given this, it is unsurprising that the mean of the samples differs from the mean of complete circles. The magnitude of the deviation, about 10^-3, is roughly the contribution of the final fractional circle divided by the total number of samples, so it shrinks only in inverse proportion to the number of whole circles completed.
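
    The split into whole circles plus a leftover fraction can be checked with a few lines of Go (my own sketch, using lambda and nIterations from the question):

    package main

    import "fmt"

    func main() {
        const lambda, nIterations = 3005, 1000000
        whole := nIterations / lambda // 332 complete circles
        extra := nIterations % lambda // 2340 samples in the final fractional circle
        fmt.Println(whole, extra) // 332 2340: the first 2340 points are sampled 333 times, the rest 332 times
    }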

    We can verify that the errors of floating-point arithmetic are slight by setting nIterations to an integer multiple of lambda. Mathematically, this produces a sequence with mean 5 (from the mean parameter), so any deviation from 5 is due to imperfect arithmetic. With nIterations set to lambda (3,005), I get a final mean of 5.00000000000000710542735760100185871124267578125 (using code from the question reimplemented in C). With nIterations set to 1000•lambda (3,005,000), I get a final mean of 5.0000000000002788880237858393229544162750244140625.

    In these whole-circle cases, the tiny differences we see are due to errors in floating-point arithmetic. With the original nIterations, the difference is almost entirely due to the incomplete final circle. That difference is mathematically real, not an artifact of floating-point arithmetic.

    Note that the code at the Go playground link differs from the code in the question. It prints the final result using an extra mean(gen()) evaluation, so it effectively performs nIterations+1 iterations. To reproduce the whole-circle results, set nIterations to one less than the desired multiple, such as 3,004 or 3,004,999 instead of 3,005 or 3,005,000.
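
    Here is a Go sketch of that whole-circle check, adapted from the code in the question (my adaptation, using the question's structure so no off-by-one adjustment is needed; it prints only the final mean, and the exact digits depend on the platform's cos implementation, but the deviation from 5 should be tiny, comparable to the C results above):

    package main

    import (
        "fmt"
        "math"
    )

    // Same generator and incremental mean as in the question, but nIterations is an
    // exact multiple of lambda, so the samples cover whole circles only.
    const (
        lambda      = 3005
        amplitude   = 2.
        mean        = 5.
        nIterations = 1000 * lambda // 3,005,000 samples = 1,000 whole circles
    )

    func main() {
        pulsation := 2 * math.Pi / lambda
        var t, n, m float64
        for i := 0; i < nIterations; i++ {
            x := amplitude*math.Cos(pulsation*t) + mean
            t++
            n++
            m += (x - m) / n
        }
        fmt.Printf("mean: %.20g, deviation from 5: %g\n", m, m-mean)
    }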

    (Bonus note: the 3,005-iteration result is 8 ULP from 5, and the 3,005,000-iteration result is 314 ULP from 5.)
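
    Those ULP distances can be verified with a short Go snippet (my addition): both results are exactly representable, and since 5 and both results lie in the same binade, the difference of their bit patterns is exactly the distance in ULP.

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        r1 := 5.00000000000000710542735760100185871124267578125   // 3,005-iteration result
        r2 := 5.0000000000002788880237858393229544162750244140625 // 3,005,000-iteration result
        fmt.Println(math.Float64bits(r1) - math.Float64bits(5)) // 8
        fmt.Println(math.Float64bits(r2) - math.Float64bits(5)) // 314
    }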

    Finding the desired mean.

    Per an additional request from the OP, here is code (in C) that finds the mean.

    #include <math.h>
    
    
    static double f(void)
    {
        static const double π = 0x3.243f6a8885a308d313198a2e03707344ap0;
    
        static const double
            λ = 3005,
            a = 2,
            µ = 5,
            ω = 2 * π / λ;
    
        static double t = 1234.5678;
    
        return a*cos(ω * t++) + µ;
    }
    
    
    #include <stdio.h>
    
    
    /*  Given a function f that returns a*cos(ω*t)+µ for unknown constants a, ω,
        and µ and a t that starts with an unknown value and increments by one in
        each call, calculate µ.
    
        We do this by finding extrema in the function to estimate its period then
        sampling points equally spaced around the cycle and applying an algebraic
        solution from Maple.  (Judging from the simplicity of the solution, I
        expect it comes from simple trigonometric properties, but I do not know its
        derivation.)  Maple was asked to solve:
    
            a*cos(w*(t+0))+u = t1,
            a*cos(w*(t+1))+u = t2,
            a*cos(w*(t+2))+u = t3, and
            a*cos(w*(t+3))+u = t4,
        
        for u, a, w, and t.  Since w is an unknown constant, the scale of 0, 1, 2,
        and 3 are irrelevant to the solution for u, so it will work for any equally
        spaced points in the series produced by the function (barring issues such
        as all the points having the same phase so they represent only one point in
        the cycle instead of four).  Since t is unknown, it will work for any
        starting point.
    
        Note that if the wavelength, λ (see sample f above), is an even integer, we
        can simply average the two extremes to find the mean.  However, λ is
        unknown, and the solution below is general.
        
        Mathematically, adjacent samples would suffice.  However, testing showed
        artifacts due to floating-point precision.  So spacing that is a
        substantial portion of the period is used.  I have not analyzed the
        numerical behavior in this situation.
    */
    int main(void)
    {
        //  Look for some change in the function.
        double t0 = f(), t1;
        for (int i = 0; i < 100; ++i)
        {
            t1 = f();
            if (t0 != t1)
                break;
        }
    
        if (t0 == t1)
            printf("Function appears to be constant, mean is %.99g\n", t0);
    
        else
        {
            //  Find first change of direction and then distance to second change.
            int count = 0;
            if (t0 < t1)
            {
                do {          t0 = t1; t1 = f(); } while (t0 <= t1);
                do { ++count; t0 = t1; t1 = f(); } while (t0 >= t1);
            }
            else
            {
                do {          t0 = t1; t1 = f(); } while (t0 >= t1);
                do { ++count; t0 = t1; t1 = f(); } while (t0 <= t1);
            }
    
            //  Sample function at roughly quarter-circle spacings.
            count /= 2;
            if (count < 1) count = 1;
    
            for (int i = 1; i < count; ++i) f();
            double t2 = f();
            for (int i = 1; i < count; ++i) f();
            double t3 = f();
            for (int i = 1; i < count; ++i) f();
            double t4 = f();
    
            //  Apply solution from Maple.
            printf("u = %.99g.\n",
                (t3*t3 + t3*t1 - t2*t2 - t2*t4) / (t1 + 3*t3 - 3*t2 - t4));
        }
    }
    

    Sample output:

    u = 4.99999999999999733546474089962430298328399658203125.
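
    For what it's worth, here is one possible derivation of that formula (my own reasoning; it may not be the route Maple took). Write the four equally spaced samples as

        t1 = u + a*cos(x),
        t2 = u + a*cos(x+w),
        t3 = u + a*cos(x+2*w),
        t4 = u + a*cos(x+3*w).

    The sum-to-product identity cos(y-w) + cos(y+w) = 2*cos(y)*cos(w), applied at y = x+w and y = x+2*w, gives

        (t1 - u) + (t3 - u) = 2*(t2 - u)*cos(w),
        (t2 - u) + (t4 - u) = 2*(t3 - u)*cos(w).

    Dividing the first equation by the second eliminates cos(w), and cross-multiplying gives

        (t1 + t3 - 2*u)*(t3 - u) = (t2 + t4 - 2*u)*(t2 - u).

    Expanding both sides, the u^2 terms cancel, and solving the remaining linear equation for u yields

        u = (t3*t3 + t3*t1 - t2*t2 - t2*t4) / (t1 + 3*t3 - 3*t2 - t4),

    which is exactly the expression used in the code (assuming the denominator is nonzero, which fails only in degenerate cases such as the ones noted in the comment).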