I'm trying to implement a 2D sliding window of square shape (k, k) so that I can iterate over a frame of shape (n, m, 3) and calculate the mean of the pixel values in each window. On each iteration, the window should move to the next position without overlapping any previous window; i.e. given this matrix:
[
[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]
]
with k = 2, the first window would be:
[
[1, 2],
[5, 6]
]
and the second window would be:
[
[3, 4],
[7, 8]
]
and so on.
I have tried using numpy.lib.stride_tricks.as_strided, but without success.
It is also important to use NumPy or another efficient library, since implementing this with a Python for-loop is too expensive for this operation.
Ignoring the third dimension, since it doesn't seem to enter the problem, how about:
import numpy as np
# Generate an array of shape (m*k, n*k)
# my m, n are the number of blocks along each axis, i.e. your frame dimensions divided by k
m, n = 2, 4
k = 3
x = np.arange(m*n*k*k).reshape((m*k, n*k))
# array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
# [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
# [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
# [36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47],
# [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
# [60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71]])
# Calculate means of each block
y = x.reshape((m, k, n, k))
z = np.moveaxis(y, -3, -2) # now shape is (m, n, k, k)
np.mean(z, axis=(-2, -1)) # take mean over last two axes
# or just np.mean(y, axis=(-3, -1))
# array([[13., 16., 19., 22.],
# [49., 52., 55., 58.]])
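Applied to the 4x4 matrix from the question with k = 2, the same reshape idea can be sketched as follows. The names a, blocks, windows and means are only illustrative, and the sketch assumes both dimensions are divisible by k.

```python
import numpy as np

k = 2
a = np.arange(1, 17).reshape(4, 4)  # the 4x4 example matrix from the question

rows, cols = a.shape
# Split each axis into (number of blocks, position within block)
blocks = a.reshape(rows // k, k, cols // k, k)

# Reorder to (block row, block col, k, k) to look at individual windows
windows = blocks.transpose(0, 2, 1, 3)
first_window = windows[0, 0]   # rows [1, 2] and [5, 6]
second_window = windows[0, 1]  # rows [3, 4] and [7, 8]

# Mean over the two "within block" axes gives one value per window
means = blocks.mean(axis=(1, 3))
# means holds 3.5 and 5.5 in the first row, 11.5 and 13.5 in the second
```

No Python-level loop is involved; the reshape only reinterprets the index layout, and the mean is computed in a single vectorized call.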
To preserve the third dimension and do the calculations on each color channel separately, just move it out of the way at the beginning (e.g. to axis 0), then move it back at the end.
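One way to sketch the channel handling described above, assuming a frame of shape (rows, cols, channels) whose rows and cols are both divisible by k. The helper name block_mean_channels is hypothetical, not part of NumPy.

```python
import numpy as np

def block_mean_channels(frame, k):
    """Per-channel mean over non-overlapping k x k blocks (hypothetical helper).

    Assumes frame has shape (rows, cols, channels) with rows and cols
    divisible by k.
    """
    rows, cols, channels = frame.shape
    if rows % k or cols % k:
        raise ValueError("rows and cols must both be divisible by k")
    # Move the channel axis out of the way: (channels, rows, cols)
    chan_first = np.moveaxis(frame, -1, 0)
    # Split both spatial axes into (number of blocks, position within block)
    blocks = chan_first.reshape(channels, rows // k, k, cols // k, k)
    # Average over the two "within block" axes
    means = blocks.mean(axis=(2, 4))  # shape (channels, rows//k, cols//k)
    # Move the channel axis back to the end
    return np.moveaxis(means, 0, -1)

frame = np.arange(4 * 6 * 3, dtype=float).reshape(4, 6, 3)
result = block_mean_channels(frame, 2)  # shape (2, 3, 3)
```

Reshaping the original frame directly to (rows//k, k, cols//k, k, channels) and averaging over axes 1 and 3 should give the same result without moving the channel axis at all.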
If the number of rows and/or columns is not divisible by k, you could pad the array with some sentinel value that doesn't appear in the data (e.g. nan) until the number of rows/columns is divisible by k. Then, take the means in some way that ignores the sentinel values (e.g. nanmean). (Alternatively, split off the remainder rows/columns, handle them separately, and combine the results.)
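The padding approach could look roughly like this. It is a sketch that assumes a 2D input containing no NaN values of its own; block_mean_padded is a hypothetical helper name. It pads at the bottom and right edges with np.pad and averages with np.nanmean.

```python
import numpy as np

def block_mean_padded(a, k):
    """Mean over k x k blocks, NaN-padding the bottom/right edges (hypothetical helper).

    Assumes a is 2D and contains no NaN values of its own.
    """
    a = np.asarray(a, dtype=float)  # NaN needs a floating-point dtype
    rows, cols = a.shape
    pad_rows = (-rows) % k  # rows to add so the total is divisible by k
    pad_cols = (-cols) % k
    padded = np.pad(a, ((0, pad_rows), (0, pad_cols)), constant_values=np.nan)
    blocks = padded.reshape(padded.shape[0] // k, k, padded.shape[1] // k, k)
    # nanmean skips the padding, so edge blocks average only the real values
    return np.nanmean(blocks, axis=(1, 3))

a = np.arange(1, 16).reshape(3, 5)  # 3 x 5 is not divisible by k = 2
edge_means = block_mean_padded(a, 2)  # shape (2, 3)
```

Because fewer than k rows or columns are added, every block keeps at least one real value, so no block is entirely NaN.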
There might be something in scipy.ndimage or scikit-image that will do this in one line. I tried zoom, but didn't find a magic combination of settings that gave the desired result.