
How to calculate average of sub-lists based on Timestamp values


I have two lists. The first contains the timestamps (in seconds) at which the data was measured, and the second contains the data.

I want to calculate the average of the data over every 10-second window. Note that the interval between two consecutive data points is not fixed.

Example:

Timestamp = [2, 5, 8, 11, 18, 23, 25, 28]
Data = [1, 2, 3, 4, 5, 6, 7, 8]

And the expected output is supposed to be:

Output = [average of [1,2,3] , average of [4,5] , average of [6,7,8]]

Is there a built-in function in Python, something like an average-if, that does this automatically?

Thank you for your help.


Solution

  • You can use the math.floor function together with a defaultdict:

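    from collections import defaultdict
    from math import floor
    
    timestamp = [2, 5, 8, 11, 18, 23, 25, 28]
    data = [1, 2, 3, 4, 5, 6, 7, 8]
    # Maps a window index (0 for 0-9 s, 1 for 10-19 s, ...) to its data values
    average_dc = defaultdict(list)
    # Sort by timestamp so the windows are created in chronological order
    for t, d in sorted(zip(timestamp, data), key=lambda x: x[0]):
        average_dc[floor(t / 10)].append(d)
    # One average per non-empty window
    averages = [sum(i) / len(i) for i in average_dc.values()]
    print(averages)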
    

    Output

    [2.0, 4.5, 7.0]
    

    sorted(zip(timestamp, data), key=lambda x: x[0]) pairs each timestamp with the data value at the same index and sorts the pairs by timestamp. The for loop then appends each data value to the list in average_dc whose key is floor(t / 10), so timestamps 0 to 9 fall under key 0, 10 to 19 under key 1, and so on.
    In the last line, the list comprehension iterates over each list in average_dc and calculates its average. A window that contains no data points never gets a key, so it is skipped rather than reported as an empty average.
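
As far as I know there is no single built-in that does an average-if over time windows, but the same grouping can also be written with only standard-library helpers. The following is a minimal sketch, not the only way to do it: the function name windowed_averages and the window parameter are made up for this example, and it assumes the windows start at 0 and are window seconds wide. It uses itertools.groupby on the sorted pairs and statistics.fmean (Python 3.8+) for the averages.

```python
from itertools import groupby
from statistics import fmean

timestamp = [2, 5, 8, 11, 18, 23, 25, 28]
data = [1, 2, 3, 4, 5, 6, 7, 8]


def windowed_averages(timestamps, values, window):
    """Average the values in each non-empty window of `window` seconds.

    Hypothetical helper for illustration; windows are assumed to be
    [0, window), [window, 2 * window), and so on.
    """
    # Pair each timestamp with its value and sort by timestamp, because
    # groupby only merges items that are adjacent in the input.
    pairs = sorted(zip(timestamps, values), key=lambda p: p[0])
    return [
        fmean(v for _, v in group)
        for _, group in groupby(pairs, key=lambda p: p[0] // window)
    ]


averages = windowed_averages(timestamp, data, 10)
print(averages)  # [2.0, 4.5, 7.0]
```

As with the defaultdict version, a window without any data points is simply absent from the result, so the output list does not tell you which window each average belongs to if there are gaps.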