Tags: python, tensorflow, moving-average, pooling

Average pooling with window over variable length sequences


I have a tensor in of shape (batch_size, features, steps) and want to get an output tensor out of the same shape by average pooling over the time dimension (steps) with a window size of 2k+1, that is:

out[b,f,t] = 1/(2k+1) sum_{t'=t-k,...,t+k} in[b,f,t']

For time steps with fewer than k preceding or succeeding time steps, I want to average only over the time steps that exist.

However, the sequences in the tensor have variable lengths and are padded with zeros accordingly. The sequence lengths are stored in another tensor (so I could, e.g., create a mask from them).

How can I perform this operation with masking and a window size?


Solution

  • As far as I know, there is no such operation in TensorFlow. However, one can use a combination of two unmasked pooling operations, here written in pseudocode:

    1. Let seq_mask be a sequence mask of shape (batch_size, time)
    2. Let in_pooled be the tensor in with unmasked average pooling applied
    3. Let seq_mask_pooled be the tensor seq_mask with unmasked average pooling with the same pool size
    4. Obtain the tensor out as follows: every element of out where the sequence mask is 0 should also be 0. Every other element is obtained by dividing in_pooled by seq_mask_pooled element-wise (note that an element of seq_mask_pooled is never 0 if the corresponding element of seq_mask is not 0).

    The tensor out can e.g. be calculated using tf.math.divide_no_nan.
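
The four steps above could be sketched in TensorFlow roughly as follows. This is a minimal sketch under some assumptions, not a definitive implementation: the function name masked_window_average and its argument names are made up for illustration, the pooling is done with tf.nn.avg_pool1d using stride 1 and "SAME" padding so that the output keeps the input length, and the input is transposed to (batch_size, steps, features) because that is the default layout tf.nn.avg_pool1d expects. The input is also multiplied by the mask first, in case the padded positions are not exactly zero.

```python
import tensorflow as tf


def masked_window_average(x, seq_lengths, k):
    """Average over a window of 2k+1 time steps, ignoring padded steps.

    x:           float tensor, shape (batch_size, features, steps), zero-padded
    seq_lengths: int tensor, shape (batch_size,)
    k:           half window size (hypothetical parameter name)
    """
    steps = tf.shape(x)[2]

    # Step 1: sequence mask, 1.0 on real time steps and 0.0 on padding.
    seq_mask = tf.sequence_mask(seq_lengths, maxlen=steps, dtype=x.dtype)
    mask = seq_mask[:, tf.newaxis, :]              # (batch, 1, steps)

    # Make sure padded positions contribute nothing to the window sums.
    x = x * mask

    # tf.nn.avg_pool1d expects (batch, width, channels) by default.
    x_nwc = tf.transpose(x, [0, 2, 1])             # (batch, steps, features)
    mask_nwc = tf.transpose(mask, [0, 2, 1])       # (batch, steps, 1)

    window = 2 * k + 1

    # Steps 2 and 3: the same unmasked pooling on the input and on the mask.
    in_pooled = tf.nn.avg_pool1d(
        x_nwc, ksize=window, strides=1, padding="SAME")
    seq_mask_pooled = tf.nn.avg_pool1d(
        mask_nwc, ksize=window, strides=1, padding="SAME")

    # Step 4: the ratio is (window sum) / (number of valid steps in window).
    # divide_no_nan returns 0 where the denominator is 0; the final
    # multiplication zeroes the remaining padded positions.
    out = tf.math.divide_no_nan(in_pooled, seq_mask_pooled) * mask_nwc

    return tf.transpose(out, [0, 2, 1])            # (batch, features, steps)


x = tf.constant([[[1.0, 2.0, 3.0, 4.0]],
                 [[10.0, 20.0, 0.0, 0.0]]])        # (2, 1, 4)
out = masked_window_average(x, tf.constant([4, 2]), k=1)
print(out.numpy())
```

For this toy input with k = 1, the first sequence (length 4) should come out as approximately [1.5, 2.0, 3.0, 3.5] and the second (length 2) as approximately [15.0, 15.0, 0.0, 0.0]. Because both pooled tensors are divided by the same window count at each position, that count cancels in the ratio, so the result does not depend on how the pooling op treats the window at the sequence borders.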