Search code examples
pythonarraysnumpystride

How does numpy reshaping of a slice works


I'm trying to reimplement numpy from scratch but I can't figure out how slicing works exactly. Take as an example this array:

> a = np.array(list(range(0,20)))

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

Let's reshape it to

> b = a.reshape(5,4)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

The strides here make sense

> b.strides

(32, 8) # float64 has 8 bytes, so row jumps by 4 floats and column moves by 1

> b.shape

(5, 4)

I can index to this array by using 4*row+col expression

for row in range(b.shape[0]):
    for col in range(b.shape[1]):
        assert a[4*row+col] == b[row,col]

Now let's take a slice

> c = b[1:,1:]

array([[ 5,  6,  7],
       [ 9, 10, 11],
       [13, 14, 15],
       [17, 18, 19]])

> c.strides

(32, 8) # It has the same strides, which sounds logical.

> c.shape

(4, 3)

I suppose that c just stores an offset to a. So indexing into this array would be equivalent to offset + 4*row+col

for row in range(c.shape[0]):
    for col in range(c.shape[1]):
        assert a[5 + 4*row+col] == c[row,col]

But the real magic happens when I perform reshape.

d = c.reshape(3,4)

array([[ 5,  6,  7,  9],
       [10, 11, 13, 14],
       [15, 17, 18, 19]])

Notice that 8, 12, and 16 are missing (as expected).

However, when I query strides

> d.strides

(32, 8)

> d.shape

(3, 4)

they look the same. Those strides are clearly lying. There is no way to index into a with such strides. So how does this magic happen? There must be some missing piece of information.

This must be implemented in some clever way. I can't believe that numpy would use some tricky wrapper objects. Especially if we consider that the exact same mechanism should be implemented in pytorch/tensorflow and these framework run on CUDA, which obviously cannot support advanced C++ magic with virtual classes and intermediate layers of abstraction. There must be some simple obvious trick to this that could run on GPU as well.


Solution

  • I like to look at a.__array_interface__ which includes a databuffer 'pointer'.. From that you can see the c offset. I suspect d is a copy, no longer a view. This should be obvious from interface.

    In [155]: a=np.arange(20)
    In [156]: a.__array_interface__
    Out[156]: 
    {'data': (34619344, False),
     'strides': None,
     'descr': [('', '<i8')],
     'typestr': '<i8',
     'shape': (20,),
     'version': 3}
    In [157]: b = a.reshape(5,4)
    In [158]: b.__array_interface__
    Out[158]: 
    {'data': (34619344, False),       # same
     'strides': None,
     'descr': [('', '<i8')],
     'typestr': '<i8',
     'shape': (5, 4),
     'version': 3}
    In [159]: c = b[1:,1:]
    In [160]: c.__array_interface__
    Out[160]: 
    {'data': (34619384, False),       # offset by 32+8
     'strides': (32, 8),
     'descr': [('', '<i8')],
     'typestr': '<i8',
     'shape': (4, 3),
     'version': 3}
    In [161]: d=c.reshape(3,4)
    In [162]: d.__array_interface__
    Out[162]: 
    {'data': (36537904, False),       # totally different
     'strides': None,
     'descr': [('', '<i8')],
     'typestr': '<i8',
     'shape': (3, 4),
     'version': 3}