
The effect of transposing a numpy array on its strides and data buffer


Suppose you are given a numpy array

import numpy as np
x = np.array([[1, 2], [3, 4]], dtype=np.int8)

and let's take its transpose.

y = x.T

My understanding from the numpy documentation has been that transpose only modifies the strides of the array, and not its underlying data buffer.

We can verify that by running

>> x.data.strides
(2, 1)

>> y.data.strides
(1, 2)

However, the data seems to be modified as well:

>> x.data.tobytes()
b'\x01\x02\x03\x04'

>> y.data.tobytes()
b'\x01\x03\x02\x04'

According to my understanding, the expected behavior is that y's data buffer remains the same as that of x, and only the strides change.

Why do we see a different data buffer for y? Is the data attribute perhaps not showing the underlying memory layout?


Solution

  • A better way to check the data buffer is with the __array_interface__ pointer:

    In [8]: y=x.T
    In [9]: x.__array_interface__
    Out[9]: 
    {'strides': None,
     'data': (144597512, False),
     'shape': (2, 2),
     'version': 3,
     'typestr': '|i1',
     'descr': [('', '|i1')]}
    In [10]: y.__array_interface__
    Out[10]: 
    {'strides': (1, 2),
     'data': (144597512, False),
     'shape': (2, 2),
     'version': 3,
     'typestr': '|i1',
     'descr': [('', '|i1')]}
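
    The comparison above can also be done programmatically. This is a minimal sketch, assuming the same x and y as in the question; the variable names x_addr, y_addr, same_start and shares are only illustrative. The address itself will differ from run to run, so only equality is checked.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.int8)
y = x.T

# First element of the 'data' tuple is the address of the first array element.
x_addr = x.__array_interface__['data'][0]
y_addr = y.__array_interface__['data'][0]

same_start = x_addr == y_addr      # both arrays start at the same address
shares = np.shares_memory(x, y)    # NumPy's own check for overlapping memory

print(same_start, shares)
print(x.strides, y.strides)        # only the strides differ
```

    If both checks are true, the transpose did not copy the data; it is a view with swapped strides.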
    

    The docs for .data say:

    In [12]: x.data?
    memoryview(object)
    Create a new memoryview object which references the given object.

    In [13]: x.data
    Out[13]: <memory at 0xb2f7cb6c>
    In [14]: y.data
    Out[14]: <memory at 0xb2f7cbe4>
    

    So y.data.tobytes() isn't showing the raw bytes of the buffer, but the bytes in the order the strides traverse them. I'm not sure if there's a direct way of seeing y's data buffer, but y.base gives the array it is a view of:

    In [25]: y.base
    Out[25]: 
    array([[1, 2],
           [3, 4]], dtype=int8)
    

    x is C-contiguous, y is F-contiguous.
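
    One possible way to see the bytes in memory order, sketched under the assumption that y is F-contiguous as it is here: ndarray.tobytes accepts an order argument, and order='A' uses Fortran order for a Fortran-contiguous array. The names logical_bytes, memory_bytes and base_bytes are illustrative only.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.int8)
y = x.T

# Default order='C': y is walked row by row, following its strides.
logical_bytes = y.tobytes()

# order='A': Fortran order is used because y is F-contiguous,
# which matches the order of the bytes in the shared buffer.
memory_bytes = y.tobytes(order='A')

# y is a view of x, so the base array holds the original layout.
base_bytes = y.base.tobytes()

print(logical_bytes)   # b'\x01\x03\x02\x04'
print(memory_bytes)    # b'\x01\x02\x03\x04'
print(base_bytes)      # b'\x01\x02\x03\x04'
print(x.flags['C_CONTIGUOUS'], y.flags['F_CONTIGUOUS'])
```

    The first result matches what y.data.tobytes() printed in the question, while the other two match x.data.tobytes(), which is consistent with the buffer being unchanged by the transpose.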