Tags: python, numpy, binary, scientific-computing

How can I understand the byte order of the result of `numpy.ndarray.view` when I use it to view two `np.uint8` values as one `np.uint16`?


Let's say I have an array a with two elements of data type np.uint8, and I'd like to view this array as if its contents were of data type np.uint16. So I use the numpy.ndarray.view method:

import numpy as np

a = np.array([1, 2], dtype=np.uint8)
print(a.view(np.uint16))

This results in [513]. However, I expected this to be:

a is [               1,               2 ]
       0 0 0 0 0 0 0 1, 0 0 0 0 0 0 1 0

                    _______________ _______________
So a.view should be 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 (binary 1 then 2)
                                                258

Why is it the other way round?

                         _______________ _______________
a.view really results in 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 (binary 2 then 1)
                                                     513

Why is the order this way?


Solution

  • Thanks to the comments from @WarrenWeckesser and @hpaulj I figured it out:

    A numpy.uint16 occupies two bytes in memory, and each of them can be interpreted as a numpy.uint8. Which of the two bytes holds the high-order (most significant) half of the value depends on the byte order, and np.uint16 uses the platform's native endianness. Storing the most significant byte first seemed intuitive to me, but that is only how big-endian platforms do it.
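To make the memory layout concrete, here is a minimal sketch that stores the same value, 513, with each explicit byte order and prints the raw bytes. The variable names are only illustrative, and sys.byteorder simply reports what the machine running the code uses.

```python
import sys
import numpy as np

# Byte order of the machine running this code: "little" or "big".
print(sys.byteorder)

# The same value, 513 (0x0201), stored with each explicit byte order.
little = np.array([513], dtype="<u2").tobytes()
big = np.array([513], dtype=">u2").tobytes()

print(little)  # b'\x01\x02' -> low-order byte comes first
print(big)     # b'\x02\x01' -> high-order byte comes first
```

The value is identical in both cases. Only the order of its two bytes in memory differs.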

    See the Wikipedia article for further information on endianness: https://en.wikipedia.org/wiki/Endianness

    My platform is little-endian, so calling a.view(np.uint16) is equivalent to a.view("<u2"). The first numpy.uint8 in memory (1) becomes the low-order byte of the numpy.uint16 and the second (2) becomes the high-order byte, which yields 2 * 256 + 1 = 513. If I call a.view(">u2") instead, the first byte is treated as the high-order one, and I get 1 * 256 + 2 = 258.
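The two results can be reproduced on any machine by spelling out the byte order in the dtype string instead of relying on the native one. This is a small sketch using the array from the question. The int.from_bytes calls are only there as a NumPy-independent cross-check.

```python
import numpy as np

a = np.array([1, 2], dtype=np.uint8)

little = a.view("<u2")  # first byte in memory is the low-order byte
big = a.view(">u2")     # first byte in memory is the high-order byte

print(little)  # [513]
print(big)     # [258]

# The same arithmetic done by hand and with plain Python integers.
raw = a.tobytes()
print(int.from_bytes(raw, "little"))  # 513 = 2 * 256 + 1
print(int.from_bytes(raw, "big"))     # 258 = 1 * 256 + 2
```

Because the byte order is explicit here, the output does not depend on the platform, unlike a.view(np.uint16).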

    > stands for big-endian, < stands for little-endian, and u2 means a 2-byte (16-bit) unsigned integer.
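As a rough illustration of how these dtype strings break down, the sketch below inspects both of them. One thing to be aware of: as far as I can tell, the byteorder attribute reports "=" when the requested order happens to be the platform's native one, so its exact output depends on the machine and is not shown here.

```python
import numpy as np

for code in ("<u2", ">u2"):
    dt = np.dtype(code)
    # kind "u" = unsigned integer, itemsize = number of bytes per element
    print(code, dt.kind, dt.itemsize, dt.byteorder)

# Both strings describe a 2-byte unsigned integer; only the byte order differs.
kinds = [np.dtype(code).kind for code in ("<u2", ">u2")]
sizes = [np.dtype(code).itemsize for code in ("<u2", ">u2")]
```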

    For further reading, there are two relevant articles on SciPy.org...