Tags: python, python-3.x, numpy, data-processing

Averaging values with irregular time intervals


I have several pairs of arrays that I want to average: each pair holds measurements and the times at which those measurements were taken. Unfortunately, the times aren't regularly spaced, and they differ from pair to pair.

My idea is to build, for each pair, a new array holding the value at every whole second, then average those arrays. It works, but it seems clumsy and means creating many unnecessarily long arrays.

Example Inputs

m1 = [0.4, 0.6, 0.2]
t1 = [0.0, 2.4, 5.2]

m2 = [1.0, 1.4, 1.0]
t2 = [0.0, 3.6, 4.8]

Generated Regular Arrays for values at each second

r1 = [0.4, 0.4, 0.4, 0.6, 0.6, 0.6, 0.2]
r2 = [1.0, 1.0, 1.0, 1.0, 1.4, 1.0]

Average values up to length of shortest array

a = [0.7, 0.7, 0.7, 0.8, 1.0, 0.8]

My attempt, given a list of measurement arrays (measurements) and the corresponding list of time arrays (times):

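import math
import numpy as np

def granulate(values, times):
    """Hold each measurement until the next one and sample at every whole second."""
    regular_values = []
    index = 0
    for second in range(math.ceil(times[-1]) + 1):
        # advance to the latest measurement taken at or before this second
        while index + 1 < len(times) and times[index + 1] <= second:
            index += 1
        regular_values.append(values[index])
    return np.array(regular_values)

# measurements and times are lists of arrays, e.g. [m1, m2] and [t1, t2]
processed_measurements = [granulate(m, t) for m, t in zip(measurements, times)]
min_length = min(len(m) for m in processed_measurements)
processed_measurements = [m[:min_length] for m in processed_measurements]
average_measurement = np.mean(processed_measurements, axis=0)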

Is there a better way to do it, ideally using numpy functions?


Solution

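  • This averages the measurements nearest to each whole second. Because it picks the nearest sample rather than the most recent one, its result differs from the expected output in the question:

    t1, t2, m1, m2 = map(np.asarray, (t1, t2, m1, m2))  # lists must become NumPy arrays first
    time_series = np.arange(max(t1.max(), t2.max()))
    # for each whole second, take the measurement whose timestamp is nearest
    np.mean([m1[abs(t1-time_series[:,None]).argmin(axis=1)], m2[abs(t2-time_series[:,None]).argmin(axis=1)]], axis=0)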
    

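    If you instead want each second to take the most recent measurement (flooring the times), in a form that generalizes to more arrays:

    m = [np.asarray(m1), np.asarray(m2)]
    t = [np.asarray(t1), np.asarray(t2)]
    m_t = []
    # take the max of each array separately, so the series may differ in length
    time_series = np.arange(max(ti.max() for ti in t))
    for mi, ti in zip(m, t):
      # rows are samples, columns are seconds: time elapsed since each sample
      time_diff = time_series - ti[:, None]
      # mask samples still in the future, then pick the most recent one
      m_t.append(mi[np.where(time_diff >= 0, time_diff, np.inf).argmin(axis=0)])
    average = np.mean(m_t, axis=0)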
    

    output:

    [0.7 0.7 0.7 0.8 1.  0.8]
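
The broadcasting approach above builds a (samples × seconds) matrix for every series. As a possible alternative, not part of the original answer, the same "most recent measurement" lookup can be sketched with np.searchsorted, which avoids that intermediate matrix. The helper names hold_last_value and average_on_grid and the step parameter are hypothetical. The sketch assumes each time array is sorted ascending, that the grid starts at 0, and that the grid stops at the shortest series, as in the question.

```python
import numpy as np

def hold_last_value(values, times, grid):
    """For each grid time, return the most recent measurement (step-function sampling)."""
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)  # assumed sorted ascending
    # side="right" makes a sample taken exactly on a grid time count for that time.
    idx = np.searchsorted(times, grid, side="right") - 1
    # Assumption: grid times before the first sample fall back to the first value.
    return values[np.clip(idx, 0, len(values) - 1)]

def average_on_grid(measurements, times, step=1.0):
    """Average several irregular series on a shared regular grid starting at 0."""
    # Stop at the shortest series, with its last timestamp rounded up to the grid.
    n_points = min(int(np.ceil(np.max(t) / step)) for t in times) + 1
    grid = np.arange(n_points) * step
    sampled = [hold_last_value(m, t, grid) for m, t in zip(measurements, times)]
    return np.mean(sampled, axis=0)

m1, t1 = [0.4, 0.6, 0.2], [0.0, 2.4, 5.2]
m2, t2 = [1.0, 1.4, 1.0], [0.0, 3.6, 4.8]

# Should reproduce the averages listed in the question, up to floating-point rounding.
print(average_on_grid([m1, m2], [t1, t2]))
```

Since searchsorted is a binary search, the work per series grows with the number of grid points times the log of the number of samples, which may matter for long recordings. For the small example here, either approach is fine.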