How to avoid the possibility of overflow when computing an average of times?

I am writing a function that measures the average number of clocks it takes to call a given void (*)(void) function (i.e. void -> void) a specified number of times.

I am worried that if the sample size gets too large, the sum of the observations will overflow and make the average invalid.

Is there a standard approach to removing the possibility of the sum overflowing in these kinds of problems?

Note: I understand that this example is too naive to conclude anything about performance; I am interested in eliminating the possibility of sum overflow, not in drawing any performance conclusions.

Note2: I also understand that a 64-bit unsigned sum will not realistically overflow unless the program runs for hundreds of years, but I am curious whether it is possible to eliminate this assumption too.

Here is my self contained code:

#include <Windows.h>
#include <stdio.h>
#include <stdlib.h>   /* system() */

/**
 * I want to parameterize the type used to store the sample size
 * to see whether it impacts performance.
 */
template <typename sampleunit_t>
static inline ULONGLONG AveragePerformanceClocks (void (*f)(),  sampleunit_t nSamples)
{
    ULONGLONG sum;
    sampleunit_t i;    

    sum = 0;

    for (i = 0; i < nSamples; ++i) {
        LARGE_INTEGER t1; 
        LARGE_INTEGER t2;
        ULONGLONG dt;

        QueryPerformanceCounter(&t1);
        f();        
        QueryPerformanceCounter(&t2);

        dt = t2.QuadPart - t1.QuadPart;

        // sum may possibly overflow if program runs long enough with
        // a large enough nSamples
        sum += dt;
    }


    // guard against dividing by zero when no samples were requested
    if (nSamples == 0) {
        return 0;
    }

    return (ULONGLONG)(sum / nSamples);
}

/* a cdecl callback that consumes time */
static void test1() 
{
    // don't optimize
    volatile int i;

    for (i = 0; i < 10000; ++i) {

    }
}

int main(int argc, char **argv)
{
    ULONGLONG avg;

    avg = AveragePerformanceClocks<BYTE>(test1, 255);    
    printf("average clocks(truncated): %llu.\n", avg);   

    avg = AveragePerformanceClocks<WORD>(test1, 255);    
    printf("average clocks(truncated): %llu.\n", avg);   

    avg = AveragePerformanceClocks<DWORD>(test1, 255);    
    printf("average clocks(truncated): %llu.\n", avg);   

    avg = AveragePerformanceClocks<ULONGLONG>(test1, 255);    
    printf("average clocks(truncated): %llu.\n", avg);   

    system("pause");

    return 0;
}

Solution

The average of the first n elements is

          SUM
Average = ---
           n

When the next element Mi arrives, the new average is

           (SUM + Mi)
Average2 = ----------
              n + 1

So given the current average, it is possible to find the next average with the new reading.

           (Average * n + Mi )
Average2 = -------------------
                  n + 1

This can then be rearranged into a form in which no intermediate value grows with n:

                       n      Mi
Average2 = Average * ----- + -----
                     n + 1   n + 1

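In practice, for timing, each individual measurement (and therefore the average) fits comfortably within the machine's native data type.

As pointed out, this needs a floating-point representation. Although it will not fail due to overflow, it can still lose accuracy: once n is large enough, the contribution Mi / (n + 1) falls below the precision of the floating-point mantissa relative to Average (equivalently, n / (n + 1) becomes indistinguishable from 1), and new readings no longer change the result.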
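If the floating-point limitation matters, one alternative that is not part of the derivation above is to keep the average in integer form as a quotient plus a remainder. The following is only a sketch under stated assumptions: `ExactMean` and its members are hypothetical names, samples are unsigned 64-bit values, and the sample count itself is assumed to stay below 2^64 - 1. It maintains the invariant sum == quotient * count + remainder without ever storing the sum.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original post): exact integer running mean.
// Invariant: (sum of all samples) == quotient * count + remainder,
// with 0 <= remainder < count once count > 0.
struct ExactMean {
    std::uint64_t count = 0;
    std::uint64_t quotient = 0;   // floor of the true average
    std::uint64_t remainder = 0;

    void add(std::uint64_t x) {
        const std::uint64_t n1 = count + 1;  // assumes count < UINT64_MAX
        if (x >= quotient) {
            // The new sample pulls the average up (or leaves it unchanged).
            const std::uint64_t d = x - quotient;
            quotient += d / n1;
            const std::uint64_t r = d % n1;
            if (remainder >= n1 - r) {       // remainder + r would reach n1: carry
                quotient += 1;
                remainder -= n1 - r;
            } else {
                remainder += r;
            }
        } else {
            // The new sample pulls the average down.
            const std::uint64_t d = quotient - x;
            quotient -= d / n1;
            const std::uint64_t r = d % n1;
            if (remainder >= r) {
                remainder -= r;
            } else {                         // borrow one unit from the quotient
                quotient -= 1;
                remainder += n1 - r;
            }
        }
        count = n1;
    }
};
```

Every intermediate value here is bounded either by the largest sample or by the sample count, so nothing accumulates. The truncated average is `quotient`, matching what `sum / nSamples` would return if the sum could be stored.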

Update

From incremental average

There is a better reorganization.

                       Mi - Average
Average2 = Average  +  -------------
                           n + 1

It is better because it needs only one division.