
Replace None in list with average of last 3 non-None entries when the average is above the same entry in another list


I have two lists:

dataa = [11, 18, 84, 51, 82, 1, 19, 45, 83, 22]
datab = [1, None, 40, 45, None, None, 23, 24, None, None]

I need to replace each None in datab with the average of the prior 3 non-None entries of datab, but only when that average is greater than the corresponding entry in dataa (see the walk-through example below). Entries without 3 prior non-None values to average should be left as they are.

My first attempt was this:

for i in range(len(dataa)):
        if (datab[i] == None):
                a = (datab[i-3]+datab[i-2]+datab[i-1])/3
                if ((datab[i-3]+datab[i-2]+datab[i-1])/3 > dataa[i]):
                        datab[i] = dataa[i]

It raises an error when computing the average if any of the prior 3 entries is None. I then tried keeping a running total, but this fails for some of the entries:

c = 0;
a = 0;
for i in range(len(dataa)):
        c = c + 1
        if (datab[i] == None):
                if (a > dataa[i]):
                        datab[i] = a
        else:
                if (c > 2):
                        a = (a * 3 + datab[i])/3

This also did not work as expected.

From this sample data, I expected:

  • Entries 1, 2, and 3 do not have 3 prior non-None entries to average, so leave them as is.
  • Entry 5 is None in datab and 82 in dataa. Since (1+40+45)/3 = 28.66 is not greater than 82, we also leave it as is.
  • Entry 6 is None in datab and 1 in dataa. The average of the 3 prior non-None entries is greater (28.66 > 1), so set it to the 28.66 average.
  • Entry 9 is None, but (28.66+23+24)/3 = 25.22 is not greater than 83, so leave it as is.
  • Entry 10 is None and the average of the 3 prior non-None entries is greater (25.22 > 22), so set it to the 25.22 average.

The correct expected output:

[1, None, 40, 45, None, 28.66, 23, 24, None, 25.22]

Solution

  • Let's use a collections.deque to keep track of our window of numbers to average, since removing the oldest item from the left end of a deque is O(1), whereas list.pop(0) is O(n).

    Thanks to @ShadowRanger for pointing out the maxlen feature of deque, which allows us to append an element, and the deque automatically pops the left element if needed.

    from collections import deque
    
    dataa = [11, 18, 84, 51, 82, 1, 19, 45, 83, 22]
    datab = [1, None, 40, 45, None, None, 23, 24, None, None]
    
    result = []
    
    moving_avg = 0
    sliding_window = deque(maxlen=3)
    
    # Iterate over the two lists simultaneously
    for a, b in zip(dataa, datab):
        # If b already has a value
        # Or the window has less than three items
        # Or the average is not greater than the element of dataa
        if b is not None or len(sliding_window) < 3 or moving_avg <= a:
            # Append the element of datab to result
            result.append(b)
        else:
            # Else, append the moving average
            result.append(moving_avg)
    
        # If the value we just appended to our result is not None
        # Then append it to the sliding window
        if result[-1] is not None:
            sliding_window.append(result[-1])
        
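        # Recalculate moving average (skip while the window is still empty,
        # e.g. when datab starts with None, to avoid dividing by zero)
        if sliding_window:
            moving_avg = sum(sliding_window) / len(sliding_window)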
            
            
    print(result) 
    # [1, None, 40, 45, None, 28.666666666666668, 23, 24, None, 25.222222222222225]
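The loop above relies on the maxlen behaviour to evict old values. To see that feature in isolation, here is a minimal sketch; the name `window` and the sample values are only illustrative:

```python
from collections import deque

# A window that holds at most three items
window = deque(maxlen=3)

for value in [1, 40, 45, 23]:
    # Once the deque is full, appending discards the oldest (leftmost) item
    window.append(value)

print(list(window))
# [40, 45, 23]
```

The first value (1) is dropped automatically when the fourth value is appended, so no explicit popleft() call is needed.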
    

    You could save on some computation time by keeping track of the element being popped off the deque and using that to calculate the moving average, but for a deque of size 3 that shouldn't be such a big deal anyway.
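As a rough sketch of that idea, the version below keeps a running sum and subtracts the value that leaves the window instead of re-summing it every time. The helper name `fill_with_running_avg` and its `size` parameter are hypothetical, introduced only for this example. It uses a plain deque without maxlen so the evicted value is available, and it compares with a strict `>` as described in the question.

```python
from collections import deque


def fill_with_running_avg(dataa, datab, size=3):
    """Hypothetical helper: fill None gaps in datab using a running sum."""
    result = []
    window = deque()   # no maxlen, so we can read the value being evicted
    running_sum = 0    # always equals sum(window)

    for a, b in zip(dataa, datab):
        value = b
        # Fill a gap only when the window is full and its average exceeds a
        if b is None and len(window) == size and running_sum / size > a:
            value = running_sum / size
        result.append(value)

        if value is not None:
            if len(window) == size:
                # Remove the oldest value from both the window and the sum
                running_sum -= window.popleft()
            window.append(value)
            running_sum += value

    return result


dataa = [11, 18, 84, 51, 82, 1, 19, 45, 83, 22]
datab = [1, None, 40, 45, None, None, 23, 24, None, None]

filled = fill_with_running_avg(dataa, datab)
print([None if v is None else round(v, 2) for v in filled])
# [1, None, 40, 45, None, 28.67, 23, 24, None, 25.22]
```

One trade-off to keep in mind: repeatedly adding and subtracting floats can accumulate small rounding errors over a long input, so the last digits may differ slightly from the sum()-based version. The output above is rounded for that reason.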