Suppose you are given a numpy array
x = np.array([[1,2],[3,4]], dtype=np.int8)
and let's take its transpose.
y = x.T
My understanding from the numpy documentation has been that transpose only modifies the strides of the array, and not its underlying data buffer.
We can verify that by running
>> x.data.strides
(2, 1)
>> y.data.strides
(1, 2)
However, the data seems to be modified as well
>> x.data.tobytes()
b'\x01\x02\x03\x04'
>> y.data.tobytes()
b'\x01\x03\x02\x04'
when the expected behavior according to my understanding should be that y's data buffer remains the same as that of x, and only the strides change.
Why do we see a different data buffer for y? Is perhaps the data attribute not showing the underlying memory layout?
A better way to check the data buffer is with the __array_interface__ pointer:
In [8]: y=x.T
In [9]: x.__array_interface__
Out[9]:
{'strides': None,
'data': (144597512, False),
'shape': (2, 2),
'version': 3,
'typestr': '|i1',
'descr': [('', '|i1')]}
In [10]: y.__array_interface__
Out[10]:
{'strides': (1, 2),
'data': (144597512, False),
'shape': (2, 2),
'version': 3,
'typestr': '|i1',
'descr': [('', '|i1')]}
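The same check can be scripted instead of compared by eye. This is a minimal sketch, assuming the x and y from the question; the variable names x_ptr, y_ptr, same_pointer and shared are only for illustration. The write at the end demonstrates that the transpose is a view onto the same memory.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.int8)
y = x.T

# Both arrays report the same starting address for their data.
x_ptr = x.__array_interface__['data'][0]
y_ptr = y.__array_interface__['data'][0]
same_pointer = x_ptr == y_ptr

# np.shares_memory confirms the overlap without comparing raw addresses.
shared = np.shares_memory(x, y)

# Writing through the transpose is visible in the original array,
# because y[0, 1] and x[1, 0] are the same byte in memory.
y[0, 1] = 99
print(same_pointer, shared, x[1, 0])
```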
The docs for .data are:
In [12]: x.data?
memoryview(object)
Create a new memoryview object which references the given object.
In [13]: x.data
Out[13]: <memory at 0xb2f7cb6c>
In [14]: y.data
Out[14]: <memory at 0xb2f7cbe4>
So y.data isn't showing the bytes of its buffer as they are laid out in memory, but the bytes as traversed by the strides. I'm not sure if there's a direct way of seeing y's data buffer, but its base is still x:
In [25]: y.base
Out[25]:
array([[1, 2],
[3, 4]], dtype=int8)
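One way to get at the bytes in memory order, as far as I can tell, is to ask tobytes for Fortran order, or to go through y.base. This is a sketch under the assumption that y is a plain transpose of an array that owns its data, as in the question; logical, physical and base_bytes are illustrative names.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.int8)
y = x.T

# Default order='C': a copy of y's elements in row-major (logical) order.
logical = y.tobytes()

# order='F': elements in column-major order, which for this
# F-contiguous y coincides with the order in memory.
physical = y.tobytes(order='F')

# y.base is x here, so its C-order bytes are the shared buffer.
base_bytes = y.base.tobytes()

print(logical, physical, base_bytes)
```

Note that tobytes always builds a new bytes object, so none of these calls hands back the buffer itself; they only differ in the order in which the elements are copied out.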
x is C-contiguous, y is F-contiguous.
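The contiguity can be read off the flags attribute. The sketch below assumes the same x and y as above; z is a hypothetical extra array, added only to show that forcing y into C order with np.ascontiguousarray produces a copy with its own buffer.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.int8)
y = x.T

# x is laid out row-major, y (the same memory) reads as column-major.
print(x.flags['C_CONTIGUOUS'], x.flags['F_CONTIGUOUS'])
print(y.flags['C_CONTIGUOUS'], y.flags['F_CONTIGUOUS'])
print(x.strides, y.strides)

# Forcing C order on y cannot be done with strides alone,
# so NumPy has to copy the data into a new buffer.
z = np.ascontiguousarray(y)
print(z.strides, np.shares_memory(x, z))
```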