python numpy implementation sliding-window stride

python implementation of numpy.lib.stride_tricks.as_strided


I'm trying to reimplement NumPy's as_strided function in plain Python. Before calling my version, I convert the strides from bytes to element counts according to the dtype of the array (for float32 I divide each stride by 4, and so on).
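As a minimal sketch of that conversion (the helper name byte_strides_to_element_strides is made up for illustration), the element size can be taken from the dtype instead of hard-coding 4:

```python
import numpy as np

def byte_strides_to_element_strides(byte_strides, dtype):
    """Convert NumPy-style byte strides to strides counted in elements.

    Hypothetical helper, not part of NumPy.
    """
    itemsize = np.dtype(dtype).itemsize  # 4 for float32, 8 for float64, ...
    if any(s % itemsize for s in byte_strides):
        raise ValueError("every stride must be a multiple of the item size")
    return tuple(s // itemsize for s in byte_strides)

element_strides = byte_strides_to_element_strides((4, 120, 120), np.float32)
print(element_strides)  # (1, 30, 30)
```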

The code I implemented:

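import numpy as np

def as_strided(x, shape, strides):
    # strides are expected in elements here, not in bytes
    x = x.flatten()  # 1-D copy of x, taken in C (row-major) index order
    # total number of elements in the output
    size = 1
    for value in shape:
        size *= value
    arr = np.zeros(size, dtype=np.float32)
    curr = 0
    # only 3-D output shapes are handled
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                arr[curr] = x[i * strides[0] + j * strides[1] + k * strides[2]]
                curr = curr + 1
    return np.reshape(arr, shape)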

In order to test the function I wrote 2 auxiliary functions:

def sliding_window(x, shape, strides):
    f_mine = as_strided(x, shape, [stride // 4 for stride in strides])
    f_np = np.lib.stride_tricks.as_strided(x, shape=shape, strides=strides).copy()
    check_strides(x.flatten(), f_mine)
    check_strides(x.flatten(), f_np)
    return f_mine, f_np

def check_strides(original, strided):
    s1 = int(np.where(original == strided[1][0][0])[0])
    s2 = int(np.where(original == strided[0][1][0])[0])
    s3 = int(np.where(original == strided[0][0][1])[0])
    print([s1, s2, s3])
    return [s1, s2, s3]

In the main code, I selected some shape and strides values and ran 2 cases:

  1. Loaded a .npy file that contains a float32 matrix (variable x).
  2. Created a random matrix of the same size and dtype as x (variable y).

When I check the strides of the resulting arrays, I see something strange. In case 1, the strides recovered from the NumPy result differ from the requested strides (and from those of my implementation). In case 2, the outputs are identical.

The main code:

shape = (30, 818, 300)
strides = (4, 120, 120)

# case 1
x = np.load('x.npy')
s_mine, s_np = sliding_window(x, shape, strides)
print(np.array_equal(s_mine, s_np))

# case 2
y = np.random.randn(x.shape[0], x.shape[1]).astype(np.float32)
s_mine, s_np = sliding_window(y, shape, strides)
print(np.array_equal(s_mine, s_np))

Here you can find the x.npy file that triggers this stride change in the NumPy function. I'd be grateful if anyone could explain why this is happening.


Solution

  • I downloaded x.npy, loaded it, and ran as_strided on y as well. I haven't looked at your code.

    Normally when playing with as_strided I like to look at the arrays, but in this case they are large enough that I'll focus more on making sense of the strides and shapes.

    In [39]: x.shape, x.strides
    Out[39]: ((30, 1117), (4, 120))
    In [40]: y.shape, y.strides
    Out[40]: ((30, 1117), (4468, 4))
    

    I wondered where you got the

    shape = (30, 818, 300)
    strides = (4, 120, 120)
    

    OK, the 30 is shared, but the 4 matches only x's strides. With those strides x looks F-ordered, maybe even a transpose of a (1117, 30) array. Your y, which was constructed with random, has the typical strides of a C-ordered array: 4 bytes for the inner, trailing dimension and 4*1117 bytes for the leading dimension.
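
    A small sketch of why the layout matters, using a made-up 3x5 array in place of x.npy (x_small, y_small and lookup are illustrative names, not from the question). NumPy's as_strided walks the underlying memory buffer, while flatten() copies in C index order, so for an F-ordered array an index computed on the flattened copy points at a different element:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Hypothetical stand-in for x.npy: the transpose of a C-ordered (5, 3) array.
base = np.arange(15, dtype=np.float32).reshape(5, 3)
x_small = base.T                          # shape (3, 5), strides (4, 12), F-ordered
y_small = np.ascontiguousarray(x_small)   # same values, strides (20, 4), C-ordered

c_order = x_small.flatten()               # C index order, not memory order
mem_order = x_small.ravel(order='K')      # elements in the order they sit in memory

# NumPy's view: byte strides applied to the memory buffer of x_small.
view = as_strided(x_small, shape=(2, 2), strides=(4, 12))

def lookup(flat, shape, elem_strides):
    """Pure-Python 2-D strided lookup on a flat array (strides in elements)."""
    out = np.empty(shape, dtype=flat.dtype)
    for i in range(shape[0]):
        for j in range(shape[1]):
            out[i, j] = flat[i * elem_strides[0] + j * elem_strides[1]]
    return out

from_c_order = lookup(c_order, (2, 2), (1, 3))
from_mem_order = lookup(mem_order, (2, 2), (1, 3))

print(np.array_equal(from_c_order, view))    # False
print(np.array_equal(from_mem_order, view))  # True
```

    For a C-ordered array such as y_small the two flattenings coincide, which would be consistent with the two implementations agreeing in case 2.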