I am looking for a way to vectorize the following code,
# Let cube have shape (N, M, M)
sub_arrays = np.empty(len(cube), 3, 3)
row_start = ... # Shape (N,) and are integers in range [0, M-2]
row_end = ... # Shape (N,) and are integers in range [1, M-1]
col_start = ... # Shape (N,) and are integers in range [0, M-2]
col_end = ... # Shape (N,) and are integers in range [1, M-1]
# extract sub arrays from cube and put them in sub_arrays
for i in range(len(cube)):
# Note that the below is extracting a (3, 3) sub array from cube
sub_arrays[i] = cube[i, row_start[i]:row_end[i], col_start[i]:col_end[i]]
Instead of the loop, I would like to do something like,
sub_arrays = cube[:, row_start:row_end, col_start:col_end]
But this throws the exception,
TypeError: only integer scalar arrays can be converted to a scalar index
Is there instead some valid way to vectorize the loop?
I believe this question is a duplicate of the one about Slicing along axis with varying indices. However, since it may not be obvious, I think it's okay to provide the answer in a new context with a somewhat different approach.
From what I can see, you want to extract data from the cube using a sliding window of a fixed size (3×3 in this case), applied to a separate slice along the first axis with varying shifts within the slices.
In contrast to the previously mentioned approach using as_strided
, let's use sliding_window_view
this time. As a result, we get two additional axes for row_start
and col_start
, followed by the window dimensions. Note that row_end
and col_end
appear as if they are equal to the corresponding starting points increased by a fixed square window side, which is 3 in this case:
from numpy.lib.stride_tricks import sliding_window_view
cube_view = sliding_window_view(cube, window_shape=(3, 3), axis=(1, 2))
output = cube_view[range(cube.shape[0]), row_start, col_start].copy()
That's all. But to be sure, let's compare the output with the original code, using test data:
import numpy as np
from numpy.random import randint
from numpy.lib.stride_tricks import sliding_window_view
n, m, w = 100, 10, 3 # w - square window size
row_start = randint(m-w+1, size=n)
col_start = randint(m-w+1, size=n)
# Test cube
cube = np.arange(n*m*m).reshape(n, m, m)
# Data to compare with
sub_arrays = np.empty((n, w, w), dtype=cube.dtype)
for i in range(cube.shape[0]):
sub_arrays[i] = cube[i, row_start[i]:row_start[i]+w, col_start[i]:col_start[i]+w]
# Subarrays from the sliding window view
cube_view = sliding_window_view(cube, window_shape=(w, w), axis=(1, 2))
output = cube_view[range(cube.shape[0]), row_start, col_start].copy()
# No exceptions should occur at this step
assert np.equal(output, sub_arrays).all()