python, list, numpy, average

Average of each consecutive segment in a list


I have a list:

sample_list = array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16])

I want to calculate the average of every 4 consecutive elements, say. Not in separate groups of 4, but starting with the first 4:

1,2,3,4

followed by:

2,3,4,5

followed by:

3,4,5,6

and so on.

The result should be an array or list containing the average of each run of 4 consecutive elements in the first list.

Output:

array([2.5, 3.5, 4.5, ...])

My attempt:

sample_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
splits = 4

def average_splits(data):
    datasum = 0
    count = 0
    for num in data:
        datasum += num
        count += 1
        if count == splits:
            yield datasum / splits
            datasum = count = 0
    if count:
        yield datasum / count

print(list(average_splits(sample_list)))

[2.5, 6.5, 10.0]

This is not the output I need, as it calculates the average of each block of 4 elements before moving on to a new block of 4. I want to move up only one element at a time in the list and calculate the average of those 4, and so on.


Solution

  • If numpy is an option, a simple way to achieve this is np.convolve, which computes a rolling mean when the data is convolved with an array of np.ones and the result is divided by the window length:

    import numpy as np
    sample_list = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16], dtype=float)
    
    w = 4
    np.convolve(sample_list, np.ones(w), 'valid') / w
    

    Output

    array([ 2.5,  3.5,  4.5,  5.5,  6.5,  7.5,  8.5,  9.5, 10.5, 11.5, 12.5,
       13.5, 14.5])
    
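For reuse, the call above can be wrapped in a small helper. This is a minimal sketch: the name `moving_average` is hypothetical (it is not a NumPy function), and it assumes a 1-D input with at least `w` elements.

```python
import numpy as np

def moving_average(data, w):
    """Mean of every run of w consecutive elements (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    # 'valid' keeps only positions where the window fully overlaps the data
    return np.convolve(data, np.ones(w), 'valid') / w

# The 11-element list from the question's attempt
result = moving_average([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 4)
print(result)
```

For the 11-element list this gives eight averages, running from 2.5 up to 9.5 in steps of 1.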

    Details

    np.convolve performs a discrete convolution between the two input arrays: sample_list and np.ones(w), an array of as many ones as the specified window length (4 in this case), i.e. array([1., 1., 1., 1.]).

    The following list comprehension aims to replicate the way np.convolve is computing the output values:

    w = 4
    ones = np.ones(w)
    np.array([sum(ones*sample_list[m:m+w]) for m in range(len(sample_list)-(w-1))]) / w
    
    array([ 2.5,  3.5,  4.5,  5.5,  6.5,  7.5,  8.5,  9.5, 10.5, 11.5, 12.5,
       13.5, 14.5])
    

    So at each iteration it takes the inner product between the array of ones and the current window of sample_list.
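One way to check this equivalence is to compute each window's inner product explicitly and compare the result with np.convolve. The sketch below is illustrative only; the variable names `explicit` and `convolved` are assumptions made for this example. Because the array of ones is symmetric, the reversal that a convolution applies to one of its inputs has no effect here.

```python
import numpy as np

sample_list = np.arange(1, 17, dtype=float)  # 1.0, 2.0, ..., 16.0
w = 4
ones = np.ones(w)

# Inner product of the ones array with each full window of length w
explicit = np.array([np.dot(ones, sample_list[m:m + w])
                     for m in range(len(sample_list) - w + 1)]) / w

# The same rolling mean computed by the convolution
convolved = np.convolve(sample_list, ones, 'valid') / w

print(np.allclose(explicit, convolved))  # True
```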

    Below is an example of how the first outputs are computed, to make this a little clearer. Note that the mode specified for the convolution here is valid, which means that outputs are computed only where the two arrays overlap completely:

    [1,1,1,1]
    [1,2,3,4,5,6,7,8...]
    = (1*1 + 1*2 + 1*3 + 1*4) / 4 = 2.5
    

    The next output is computed as:

      [1,1,1,1]
    [1,2,3,4,5,6,7,8...]
    = (1*2 + 1*3 + 1*4 + 1*5) / 4 = 3.5
    

    And so on, yielding, as mentioned earlier, a moving average of sample_list.
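To see what the valid mode changes, the output lengths under the three modes that np.convolve accepts can be compared. This is a small sketch using the same 16-element input; the dictionary name `lengths` is just for illustration.

```python
import numpy as np

sample_list = np.arange(1, 17, dtype=float)  # 16 elements
w = 4

# Length of the convolution output for each mode
lengths = {mode: len(np.convolve(sample_list, np.ones(w), mode))
           for mode in ('valid', 'same', 'full')}
print(lengths)  # {'valid': 13, 'same': 16, 'full': 19}
```

With same and full, the values at the edges come from partial overlaps, so dividing them by w would not give the average of 4 actual elements. That is why valid is the mode that matches the output asked for in the question.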